Rec'd PCT/PTO 20 MAY 2005

WO 2004/048550 PCT/US2003/038178 

IMMUNE RESPONSE ASSOCIATED PROTEINS

TECHNICAL FIELD 
The invention relates to novel nucleic acids, immune response associated proteins encoded by
these nucleic acids, and to the use of these nucleic acids and proteins in the diagnosis, treatment, and
prevention of immune system, neurological, developmental, muscle, cell proliferative disorders, and
disorders of lipid metabolism. The invention also relates to the assessment of the effects of
exogenous compounds on the expression of nucleic acids and immune response associated proteins.



BACKGROUND OF THE INVENTION

All vertebrates have developed sophisticated and complex immune systems that provide
protection from viral, bacterial, fungal and parasitic infections. Included in these systems are the
processes of humoral immunity, the complement cascade and the inflammatory response (See Paul,
(1993) Fundamental Immunology, Raven Press, Ltd., New York NY pp. 1-20).

The cellular components of the immune system include six different types of leukocytes, or
white blood cells: monocytes, lymphocytes, polymorphonuclear granulocytes (including neutrophils,
eosinophils, and basophils) and plasma cells. Additionally, fragments of megakaryocytes, a seventh
type of white blood cell in the bone marrow, occur in large numbers in the blood as platelets.

Leukocytes are formed from two stem cell lineages in bone marrow. The myeloid stem cell
line produces granulocytes and monocytes and the lymphoid stem cell line produces lymphocytes.
Lymphoid cells travel to the thymus, spleen and lymph nodes, where they mature and differentiate
into lymphocytes. Leukocytes are responsible for defending the body against invading pathogens.
Neutrophils and monocytes attack invading bacteria, viruses, and other pathogens and destroy them
by phagocytosis. Monocytes enter tissues and differentiate into macrophages which are extremely
phagocytic. Lymphocytes and plasma cells are a part of the immune system which recognizes
specific foreign molecules and organisms and inactivates them, as well as signals other cells to attack
the invaders.

Granulocytes and monocytes are formed and stored in the bone marrow until needed.
Megakaryocytes are produced in bone marrow, where they fragment into platelets and are released
into the bloodstream. The main function of platelets is to activate the blood clotting mechanism.
Lymphocytes and plasma cells are produced in various lymphogenous organs, including the lymph
nodes, spleen, thymus, and tonsils.

Both neutrophils and macrophages exhibit chemotaxis towards sites of inflammation. Tissue
inflammation in response to pathogen invasion results in production of chemo-attractants for
leukocytes, such as endotoxins or other bacterial products, prostaglandins, and products of leukocytes
or platelets.

Basophils participate in the release of the chemicals involved in the inflammatory process.
The main function of basophils is secretion of these chemicals, to such a degree that they have been
referred to as "unicellular endocrine glands." A distinct aspect of basophilic secretion is that the
contents of granules go directly into the extracellular environment, not into vacuoles as occurs with
neutrophils, eosinophils, and monocytes. Basophils have receptors for the Fc fragment of
immunoglobulin E (IgE) that are not present on other leukocytes. Crosslinking of membrane IgE with
anti-IgE or other ligands triggers degranulation.

Eosinophils are bi- or multi-nucleated white blood cells which contain eosinophilic granules.
Their plasma membrane is characterized by Ig receptors, particularly IgG and IgE. Generally,
eosinophils are stored in the bone marrow until recruited for use at a site of inflammation or invasion.
They have specific functions in parasitic infections and allergic reactions, and are thought to detoxify
some of the substances released by mast cells and basophils which cause inflammation. Additionally,
they phagocytize antigen-antibody complexes and further help prevent the spread of inflammation.

The mononuclear phagocyte system is comprised of precursor cells in the bone marrow,
monocytes in circulation, and macrophages in tissues. Macrophages are monocytes that have left the
blood stream to settle in tissue. Once monocytes have migrated into tissues, they do not re-enter the
bloodstream. They increase several-fold in size and transform into macrophages that are
characteristic of the tissue they have entered, surviving in tissues for several months. The
mononuclear phagocyte system is capable of very fast and extensive phagocytosis. A macrophage
may phagocytize over 100 bacteria, digest them and extrude residues, and then survive for many
more months. Macrophages are also capable of ingesting large particles, including red blood cells
and malarial parasites.

Mononuclear phagocytes are essential in defending the body against invasion by foreign
pathogens, particularly intracellular microorganisms such as Mycobacterium tuberculosis, listeria,
leishmania and toxoplasma. Macrophages can also control the growth of tumorous cells, via both
phagocytosis and secretion of hydrolytic enzymes. Another important function of macrophages is
that of processing antigens and presenting them in a biochemically modified form to lymphocytes.

The immune system responds to invading microorganisms in two major ways: antibody
production and cell mediated responses. Antibodies are immunoglobulin proteins produced by
B-lymphocytes which bind to specific antigens and cause inactivation or promote destruction of the
antigen by other cells. Cell-mediated immune responses involve T-lymphocytes (T cells) that react
with foreign antigens on the surface of infected host cells. Depending on the type of T cell, the T cell
either kills the infected cell itself, or secretes signals which activate macrophages and other cells to
destroy the infected cell (Paul, supra).



T-lymphocytes originate in the bone marrow or liver in fetuses. Precursor cells migrate via
the blood to the thymus, where they are processed to mature into T-lymphocytes. This processing is
crucial because it involves positive and negative selection of T cells for those that will react with
foreign antigen and not with self molecules. After processing, T cells continuously circulate in the
blood and secondary lymphoid tissues, such as lymph nodes, spleen, certain epithelium-associated
tissues in the gastrointestinal tract, respiratory tract and skin. When T-lymphocytes are presented
with the complementary antigen, they are stimulated to proliferate and release large numbers of
activated T cells into the lymph system and the blood system. These activated T cells can survive
and circulate for several days. At the same time, T memory cells are created, which remain in the
lymphoid tissue for months or years. Upon subsequent exposure to that specific antigen, these
memory cells will respond more rapidly and with a stronger response than induced by the original
antigen. This creates an "immunological memory" that can provide immunity for years.

There are two major types of T cells: cytotoxic T cells destroy infected host cells, and helper
T cells activate other white blood cells via chemical signals. One class of helper cell, Th1, activates
macrophages to destroy ingested microorganisms, while another, Th2, stimulates the production of
antibodies by B cells.

Cytotoxic T cells directly attack the infected target cell. Receptors on the surface of T cells
bind to antigen presented by MHC molecules on the surface of the infected cell. Once activated by
binding to antigen, T cells secrete γ-interferon, a signal molecule that induces the expression of genes
necessary for presenting viral (or other) antigens to cytotoxic T cells. Cytotoxic T cells kill the
infected cell by stimulating programmed cell death.

Helper T cells constitute up to 75% of the total T cell population. They regulate the immune
functions by producing a variety of lymphokines that act on other cells in the immune system and on
bone marrow. Among these lymphokines are interleukins 2 through 6, granulocyte-monocyte colony
stimulating factor, and γ-interferon.

Helper T cells are required for most B cells to respond to antigen. When an activated helper
cell contacts a B cell, its centrosome and Golgi apparatus become oriented toward the B cell, aiding
the directing of signal molecules, such as a transmembrane-bound protein called CD40 ligand, onto
the B cell surface to interact with the CD40 transmembrane protein. Secreted signals also help B
cells to proliferate and mature and, in some cases, to switch the class of antibody being produced.

B-lymphocytes (B cells) produce antibodies which react with specific antigenic proteins
presented by pathogens. Once activated, B cells become filled with extensive rough endoplasmic
reticulum and are known as plasma cells. As with T cells, interaction of B cells with antigen
stimulates proliferation of only those B cells which produce antibody specific to that antigen. There
are five classes of antibodies, known as immunoglobulins, which together comprise about 20% of
total plasma protein. Each class mediates a characteristic biological response after antigen binding.
Upon activation by specific antigen, B cells switch from making the membrane-bound antibody to the
secreted form of that antibody.

Antibodies, or immunoglobulins, are the founding members of the immunoglobulin (Ig)
superfamily and the central components of the humoral immune response. Antibodies are either
expressed on the surface of B cells or secreted by B cells into the circulation. Antibodies bind and
neutralize blood-borne foreign antigens. The prototypical antibody is a tetramer consisting of two
identical heavy polypeptide chains (H-chains) and two identical light polypeptide chains (L-chains)
interlinked by disulfide bonds. This arrangement confers the characteristic Y-shape to antibody
molecules. Antibodies are classified based on their H-chain composition. The five antibody classes,
IgA, IgD, IgE, IgG and IgM, are defined by the α, δ, ε, γ, and μ H-chain types. There are two types
of L-chains, κ and λ, either of which may associate as a pair with any H-chain pair. IgG, the most
common class of antibody found in the circulation, is tetrameric, while the other classes of antibodies
are generally variants or multimers of this basic structure.

H-chains and L-chains each contain an N-terminal variable region and a C-terminal constant
region. The constant region consists of about 110 amino acids in L-chains and about 330 or 440
amino acids in H-chains. The amino acid sequence of the constant region is nearly identical among
H- or L-chains of a particular class. The variable region consists of about 110 amino acids in both
H- and L-chains. However, the amino acid sequence of the variable region differs among H- or L-chains
of a particular class. Within each H- or L-chain variable region are three hypervariable regions of
extensive sequence diversity, each consisting of about 5 to 10 amino acids. In the antibody molecule,
the H- and L-chain hypervariable regions come together to form the antigen recognition site.
(Reviewed in Alberts, B. et al. (1994) Molecular Biology of the Cell, Garland Publishing, New York,
NY, pp. 1206-1213 and 1216-1217.)

The immune system is capable of recognizing and responding to any foreign molecule that
enters the body. Therefore, the immune system must be armed with a full repertoire of antibodies
against all potential antigens. Such antibody diversity is generated by somatic rearrangement of gene
segments encoding variable and constant regions. These gene segments are joined together by
site-specific recombination which occurs between highly conserved DNA sequences that flank each gene
segment. Because there are hundreds of different gene segments, millions of unique genes can be
generated combinatorially. In addition, imprecise joining of these segments and an unusually high
rate of somatic mutation within these segments further contribute to the generation of a diverse
antibody population.
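The combinatorial arithmetic behind this statement can be sketched as follows. This is only an
illustration: the pool labels (V, D, J) and every pool size below are hypothetical values chosen so
that a few hundred segments are involved, and are not figures taken from this document.

```python
from math import prod


def combinatorial_diversity(segment_pools):
    """Count distinct joined genes when one segment is drawn from each pool."""
    return prod(segment_pools.values())


# Hypothetical pool sizes, for illustration only (assumed, not from the text).
heavy_pools = {"V": 100, "D": 20, "J": 5}
light_pools = {"V": 50, "J": 5}

heavy = combinatorial_diversity(heavy_pools)  # 100 * 20 * 5
light = combinatorial_diversity(light_pools)  # 50 * 5
paired = heavy * light  # any H-chain gene may pair with any L-chain gene

total_segments = sum(heavy_pools.values()) + sum(light_pools.values())
print(f"{total_segments} segments -> {paired:,} H/L combinations")
```

Under these assumed numbers, fewer than two hundred segments already yield millions of
heavy/light combinations, before imprecise joining and somatic mutation are taken into account.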

Both H-chains and L-chains contain repeated Ig domains. For example, a typical H-chain
contains four Ig domains, three of which occur within the constant region and one of which occurs
within the variable region and contributes to the formation of the antigen recognition site. Likewise,
a typical L-chain contains two Ig domains, one of which occurs within the constant region and one of
which occurs within the variable region. In addition, H chains such as μ have been shown to
associate with other polypeptides during differentiation of the B-cell.

Antibodies can be described in terms of their two main functional domains. Antigen
recognition is mediated by the Fab (antigen binding fragment) region of the antibody, while effector
functions are mediated by the Fc (crystallizable fragment) region. Binding of antibody to an antigen,
such as a bacterium, triggers the destruction of the antigen by phagocytic white blood cells such as
macrophages and neutrophils. These cells express surface receptors that specifically bind to the
antibody Fc region and allow the phagocytic cells to engulf, ingest, and degrade the antibody-bound
antigen. The Fc receptors expressed by phagocytic cells are single-pass transmembrane
glycoproteins of about 300 to 400 amino acids (Sears, D.W. et al. (1990) J. Immunol. 144:371-378).
The extracellular portion of the Fc receptor typically contains two or three Ig domains.

Diseases which cause over- or under-abundance of any one type of leukocyte usually result in
the entire immune defense system becoming involved. The most notorious autoimmune disease is
AIDS (Acquired Immunodeficiency Syndrome). This disease depletes the number of helper T cells
and leaves the patient susceptible to infection by microorganisms and parasites.

Another widespread medical condition attributable to the immune system is that of allergic
reactions to certain antigens. Delayed reaction allergy is experienced by many genetically normal
people. In the case of atopic allergies, there is a genetic origin, such that large quantities of IgE
antibodies are produced. IgEs have a strong tendency to attach to mast cells and basophils, up to half
a million each (IgE/mast), which then rupture and release histamine, leukotrienes, eosinophil
chemotactic substance, protease, neutrophil chemotactic substance, heparin, and platelet activation
factors. Tissues can respond in a number of ways to these substances resulting in what are commonly
known as allergic reactions: hay fever, asthma, anaphylaxis, and urticaria (hives).

Leukemias are an excess production of white blood cells, to the point where a major portion
of the body's metabolic resources are directed solely at proliferation of white blood cells, leaving
other tissues to starve. With lymphogenous leukemias, cancerous lymphogenous cells spread from a
lymph node to other body parts. Excess T- and B-lymphocytes are produced. In myelogenous
leukemias, cancerous young myelogenous cells spread from the bone marrow to other organs,
especially the spleen, liver, lymph nodes and other highly vascularized regions. Usually, the extra
leukemic cells released are immature, incapable of function, and undifferentiated. Occasionally,
partially differentiated cells are produced, leading to classification of the disease as neutrophilic
leukemia, eosinophilic leukemia, basophilic leukemia, or monocytic leukemia. Leukemias may be
caused by exposure to environmental factors such as radiation or toxic chemicals or by genetic
aberration.

Leukopenia or agranulocytosis occurs when the bone marrow stops producing white blood
cells. This leaves the body unprotected against foreign microorganisms, including those which
normally inhabit skin, mucous membranes, and gastrointestinal tract. If all white blood cell
production stops completely, infection will occur within two days and death may follow only 1 to 4
days later. Acute leukopenia can be caused by exposure to radiation or chemicals containing
benzene. Occasionally, drugs such as chloramphenicol and thiouracil can suppress blood cell
production by the bone marrow and initiate the onset of agranulocytosis. In cases of monoblastic
leukemia, primitive monocytes in blood and bone marrow do not mature. Clinical symptoms reflect
this abnormality: high lysozyme levels in blood serum, renal tubular dysfunction, and high fevers.

Impaired phagocytosis occurs in several diseases, including monocytic leukemia, systemic
lupus, and granulomatous disease. In such a situation, macrophages can phagocytize normally, but
the enveloped organism is not killed. There is a defect in the plasma membrane enzyme which
converts oxygen to lethally reactive forms. This results in abscess formation in liver, lungs, spleen,
lymph nodes, and beneath the skin.

Eosinophilia is an excess of eosinophils commonly observed in patients with allergies (hay
fever, asthma), allergic reactions to drugs, rheumatoid arthritis, and cancers (Hodgkin's disease, lung,
and liver cancer). The mechanism for elevated levels of eosinophils in these diseases is unknown
(Isselbacher, K.J. et al. (1994) Harrison's Principles of Internal Medicine, McGraw-Hill, Inc., New
York, NY).

Host defense is further augmented by the complement system. The complement system
serves as an effector system and is involved in infectious agent recognition. It can function as an
independent immune network or in conjunction with other humoral immune responses. The
complement system is comprised of numerous plasma and membrane proteins that act in a cascade
of reaction sequences whereby one component activates the next. The result is a rapid and amplified
response to infection through either an inflammatory response or increased phagocytosis.

The complement system has more than 30 protein components which can be divided into
functional groupings including modified serine proteases, membrane-binding proteins, and
regulators of complement activation. Activation occurs through two different pathways, the
classical and the alternative. Both pathways serve to destroy infectious agents through distinct
triggering mechanisms that eventually merge with the involvement of the component C3.

The classical pathway requires antibody binding to infectious agent antigens. The
antibodies serve to define the target and initiate the complement system cascade, culminating in the
destruction of the infectious agent. In this pathway, since the antibody guides initiation of the
process, the complement system can be seen as an effector arm of the humoral immune system.



The alternative pathway of the complement system does not require the presence of
pre-existing antibodies for targeting infectious agent destruction. Rather, this pathway, through low
levels of an activated component, remains constantly primed and provides surveillance in the
non-immune host to enable targeting and destruction of infectious agents. In this case foreign material
triggers the cascade, thereby facilitating phagocytosis or lysis (Paul, supra pp. 918-919).

Another important component of host defense is the process of inflammation. Inflammatory
responses are divided into four categories on the basis of pathology and include allergic
inflammation, cytotoxic antibody mediated inflammation, immune complex mediated inflammation,
and monocyte mediated inflammation. Inflammation manifests as a combination of each of these
forms with one predominating.

Allergic acute inflammation is observed in individuals wherein specific antigens stimulate IgE
antibody production. Mast cells and basophils are subsequently activated by the attachment of
antigen-IgE complexes, resulting in the release of cytoplasmic granule contents such as histamine.
The products of activated mast cells can increase vascular permeability and constrict the smooth
muscle of breathing passages, resulting in anaphylaxis or asthma.

Acute inflammation is also mediated by cytotoxic antibodies and can result in the destruction
of tissue through the binding of complement-fixing antibodies to cells. In this case the antibodies
responsible are of the IgG or IgM types, and resultant clinical disorders include autoimmune
hemolytic anemia and thrombocytopenia as associated with systemic lupus erythematosus.

Immune complex mediated acute inflammation involves the IgG or IgM antibody types
which combine with antigen to activate the complement cascade. When such immune complexes
bind to neutrophils and macrophages they activate the respiratory burst to form protein and vessel
damaging agents such as hydrogen peroxide, hydroxyl radical, hypochlorous acid, and chloramines.
Clinical manifestations include rheumatoid arthritis and systemic lupus erythematosus.

In chronic inflammation or delayed-type hypersensitivity, macrophages are activated and
process antigen for presentation to T cells that subsequently produce lymphokines and monokines.
This type of inflammatory response is likely important for defense against intracellular parasites and
certain viruses. Clinical associations include granulomatous disease, tuberculosis, leprosy, and
sarcoidosis (Paul, supra pp. 1017-1018).

Most cell surface and soluble molecules that mediate functions such as recognition, adhesion
or binding have evolved from a common evolutionary precursor (i.e., these proteins have structural
homology). A number of molecules outside the immune system that have similar functions are also
derived from this same evolutionary precursor. These molecules are classified as members of the
immunoglobulin (Ig) superfamily. The criterion for a protein to be a member of the Ig superfamily is
to have one or more Ig domains, which are regions of 70-110 amino acid residues in length
homologous to either Ig variable-like (V) or Ig constant-like (C) domains. Members of the Ig
superfamily include antibodies (Ab), T cell receptors (TCRs), class I and II major histocompatibility
(MHC) proteins, CD2, CD3, CD4, CD8, poly-Ig receptors, Fc receptors, neural cell-adhesion
molecule (NCAM) and platelet-derived growth factor receptor (PDGFR).

Ig domains (V and C) are regions of conserved amino acid residues that give a polypeptide a
globular tertiary structure called an immunoglobulin (or antibody) fold, which consists of two
approximately parallel layers of β-sheets. Conserved cysteine residues form an intrachain
disulfide-bonded loop, 55-75 amino acid residues in length, which connects the two layers of the β-sheets.
Each β-sheet has three or four anti-parallel β-strands of 5-10 amino acid residues. Hydrophobic and
hydrophilic interactions of amino acid residues within the β-strands stabilize the Ig fold
(hydrophobic on inward facing amino acid residues and hydrophilic on the amino acid residues in
the outward facing portion of the strands). A V domain consists of a longer polypeptide than a C
domain, with an additional pair of β-strands in the Ig fold.

A consistent feature of Ig superfamily genes is that each sequence of an Ig domain is
encoded by a single exon. It is possible that the superfamily evolved from a gene coding for a single
Ig domain involved in mediating cell-cell interactions. New members of the superfamily then arose
by exon and gene duplications. Modern Ig superfamily proteins contain different numbers of V
and/or C domains. Another evolutionary feature of this superfamily is the ability to undergo DNA
rearrangements, a unique feature retained by the antigen receptor members of the family.

Many members of the Ig superfamily are integral plasma membrane proteins with
extracellular Ig domains. The hydrophobic amino acid residues of their transmembrane domains and
their cytoplasmic tails are very diverse, with little or no homology among Ig family members or to
known signal-transducing structures. There are exceptions to this general superfamily description.
For example, the cytoplasmic tail of PDGFR has tyrosine kinase activity. In addition, Thy-1 is a
glycoprotein found on thymocytes and T cells. This protein has no cytoplasmic tail, but is instead
attached to the plasma membrane by a covalent glycophosphatidylinositol linkage.

Another common feature of many Ig superfamily proteins is the interactions between Ig
domains which are essential for the function of these molecules. Interactions between Ig domains of
a multimeric protein can be either homophilic or heterophilic (i.e., between the same or different Ig
domains). Antibodies are multimeric proteins which have both homophilic and heterophilic
interactions between Ig domains. Pairing of constant regions of heavy chains forms the Fc region of
an antibody and pairing of variable regions of light and heavy chains forms the antigen binding site of
an antibody. Heterophilic interactions also occur between Ig domains of different molecules. These
interactions provide adhesion between cells for significant cell-cell interactions in the immune
system and in the developing and mature nervous system. (Reviewed in Abbas, A.K. et al. (1991)
Cellular and Molecular Immunology, W.B. Saunders Company, Philadelphia, PA, pp. 142-145.)

Sushi domains, also known as complement control protein (CCP) modules, or short
consensus repeats (SCR), are found in a wide variety of complement and adhesion proteins. CD21
(also called C3d receptor, CR2, Epstein Barr virus receptor or EBV-R) is the receptor for EBV and
for C3d, C3dg and iC3b. Complement components may activate B cells through CD21. CD21 is
part of a large signal-transduction complex that also involves CD19, CD81, and Leu13. Some of the
proteins in this group are responsible for the molecular basis of the blood group antigens, surface
markers on the outside of the red blood cell membrane. Most of these markers are proteins, but
some are carbohydrates attached to lipids or proteins (for a review see Reid, M.E. and C.
Lomas-Francis (1997) The Blood Group Antigen FactsBook, Academic Press, San Diego, CA).
Complement decay-accelerating factor (Antigen CD55) belongs to the Cromer blood group system
and is associated with Cr(a), Dr(a), Es(a), Tc(a/b/c), Wd(a), WES(a/b), IFC and UMC antigens.
Complement receptor type 1 (C3b/C4b receptor) (Antigen CD35) belongs to the Knops blood group
system and is associated with Kn(a/b), McC(a), Sl(a) and Yk(a) antigens.

Human leukocyte-specific transcript 1 (LST1) is a small protein that modulates immune
responses and cellular morphogenesis. LST1 is expressed at high levels in dendritic cells. A
DNA-binding site and interaction of multiple regulatory elements may be involved in mediating the
expression of the various forms of LST1 mRNA (Yu, X. and Weissman, S.M. (2000) J. Biol. Chem.
275:34597-34608).

Spalpha is a member of the scavenger receptor cysteine-rich (SRCR) family of proteins.
Spalpha is expressed only in lymphoid tissues, where it is implicated in monocyte activity (Gebe,
J.A. (1997) J. Biol. Chem. 272:6151-6158).

A family of metalloproteases, the ADAMs (for A Disintegrin and Metalloprotease Domain),
has been shown to play a role in the immune system (Yamamoto, S. et al. (1999) Immunol. Today
20:278-84). These proteins share features with their close relatives the adamalysins, snake venom
metalloproteases (SVMPs). ADAMs combine features of both cell surface adhesion molecules and
proteases, containing a prodomain, a protease domain, a disintegrin domain, a cysteine rich domain,
an epidermal growth factor repeat, a transmembrane domain, and a cytoplasmic tail. The first three
domains listed above are also found in the SVMPs. The ADAMs possess four potential functions:
proteolysis, adhesion, signaling and fusion. The ADAMs share the metzincin zinc binding sequence
and are inhibited by some MMP antagonists such as TIMP-1.

ADAMs are implicated in such processes as sperm-egg binding and fusion, myoblast fusion,
and protein-ectodomain processing or shedding of cytokines, cytokine receptors, adhesion proteins
and other extracellular protein domains (Schlöndorff, J. and C.P. Blobel (1999) J. Cell. Sci.
112:3603-3617). The Kuzbanian protein cleaves a substrate in the NOTCH pathway (possibly
NOTCH itself), activating the program for lateral inhibition in Drosophila neural development. Two
ADAMs, TACE (ADAM 17) and ADAM 10, are proposed to have analogous roles in the processing
of amyloid precursor protein in the brain (Schlöndorff and Blobel, supra). TACE has also been
identified as the TNF activating enzyme (Black, R.A. et al. (1997) Nature 385:729-733). TNF is a
pleiotropic cytokine that is important in mobilizing host defenses in response to infection or trauma,
but can cause severe damage in excess and is often overproduced in autoimmune disease. TACE
cleaves membrane-bound pro-TNF to release a soluble form. Other ADAMs may be involved in a
similar type of processing of other membrane-bound molecules.

Expression profiling

Microarrays are analytical tools used in bioanalysis. A microarray has a plurality of
molecules spatially distributed over, and stably associated with, the surface of a solid support.
Microarrays of polypeptides, polynucleotides, and/or antibodies have been developed and find use in
a variety of applications, such as gene sequencing, monitoring gene expression, gene mapping,
bacterial identification, drug discovery, and combinatorial chemistry.

One area in particular in which microarrays find use is in gene expression analysis. Array
technology can provide a simple way to explore the expression of a single polymorphic gene or the
expression profile of a large number of related or unrelated genes. When the expression of a single
gene is examined, arrays are employed to detect the expression of a specific gene or its variants.
When an expression profile is examined, arrays provide a platform for identifying genes that are
tissue specific, are affected by a substance being tested in a toxicology assay, are part of a signaling
cascade, carry out housekeeping functions, or are specifically related to a particular genetic
predisposition, condition, disease, or disorder.
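
As a minimal sketch of how an expression profile might be screened for genes affected by a test
substance, the example below compares array signal intensities between an untreated and a treated
sample. The gene names, intensity values, function names, and the two-fold cutoff are all
hypothetical and are not taken from this document; real array analyses also involve normalization
and replicate statistics that are omitted here.

```python
from math import log2


def log2_fold_changes(control, treated):
    """Return log2(treated / control) for every gene measured in both samples."""
    return {
        gene: log2(treated[gene] / control[gene])
        for gene in control
        if gene in treated
    }


def affected_genes(control, treated, threshold=1.0):
    """List genes whose absolute log2 fold change meets the threshold."""
    changes = log2_fold_changes(control, treated)
    return sorted(gene for gene, lfc in changes.items() if abs(lfc) >= threshold)


# Hypothetical signal intensities (arbitrary units), for illustration only.
control = {"geneA": 100.0, "geneB": 200.0, "geneC": 50.0}
treated = {"geneA": 400.0, "geneB": 210.0, "geneC": 10.0}

affected = affected_genes(control, treated)
print(affected)
```

With these assumed values, geneA (four-fold increase) and geneC (five-fold decrease) pass the
cutoff, while geneB is essentially unchanged.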
Breast Cancer 

More than 180,000 new cases of breast cancer are diagnosed each year, and the mortality
rate for breast cancer approaches 10% of all deaths in females between the ages of 45-54 (Gish, K.
(1999) AWIS Magazine 28:7-10). However, the survival rate based on early diagnosis of localized
breast cancer is extremely high (97%), compared with the advanced stage of the disease in which the
tumor has spread beyond the breast (22%). Current procedures for clinical breast examination are
lacking in sensitivity and specificity, and efforts are underway to develop comprehensive gene
expression profiles for breast cancer that may be used in conjunction with conventional screening
methods to improve diagnosis and prognosis of this disease (Perou, C.M. et al. (2000) Nature
406:747-752).

Mutations in two genes, BRCA1 and BRCA2, are known to greatly predispose a woman to
breast cancer and may be passed on from parents to children (Gish, supra). However, this type of
hereditary breast cancer accounts for only about 5% to 9% of breast cancers, while the vast majority
of breast cancer is due to non-inherited mutations that occur in breast epithelial cells.

The relationship between expression of epidermal growth factor (EGF) and its receptor,
EGFR, to human mammary carcinoma has been particularly well studied. (See Khazaie, K. et al.
(1993) Cancer and Metastasis Rev. 12:255-274, and references cited therein for a review of this
area.) Overexpression of EGFR, particularly coupled with down-regulation of the estrogen receptor,
is a marker of poor prognosis in breast cancer patients. In addition, EGFR expression in breast
tumor metastases is frequently elevated relative to the primary tumor, suggesting that EGFR is
involved in tumor progression and metastasis. This is supported by accumulating evidence that EGF
has effects on cell functions related to metastatic potential, such as cell motility, chemotaxis,
secretion and differentiation. Changes in expression of other members of the erbB receptor family,
of which EGFR is one, have also been implicated in breast cancer. The abundance of erbB
receptors, such as HER-2/neu, HER-3, and HER-4, and their ligands in breast cancer points to their
functional importance in the pathogenesis of the disease, and may therefore provide targets for
therapy of the disease (Bacus, S.S. et al. (1994) Am. J. Clin. Pathol. 102:S13-S24). Other known
markers of breast cancer include a human secreted frizzled protein mRNA that is downregulated in
breast tumors; the matrix Gla protein which is overexpressed in human breast carcinoma cells; Drg1
or RTP, a gene whose expression is diminished in colon, breast, and prostate tumors; maspin, a
tumor suppressor gene downregulated in invasive breast carcinomas; and CaN19, a member of the
S100 protein family, all of which are down-regulated in mammary carcinoma cells relative to normal
mammary epithelial cells (Zhou, Z. et al. (1998) Int. J. Cancer 78:95-99; Chen, L. et al. (1990)
Oncogene 5:1391-1395; Ulrix, W. et al. (1999) FEBS Lett. 455:23-26; Sager, R. et al. (1996) Curr.
Top. Microbiol. Immunol. 213:51-64; and Lee, S.W. et al. (1992) Proc. Natl. Acad. Sci. USA
89:2504-2508).

Cell lines derived from human mammary epithelial cells at various stages of breast cancer
provide a useful model to study the process of malignant transformation and tumor progression as it
has been shown that these cell lines retain many of the properties of their parental tumors for lengthy
culture periods (Wistuba, I.I. et al. (1998) Clin. Cancer Res. 4:2931-2938). Such a model is
particularly useful for comparing phenotypic and molecular characteristics of human mammary
epithelial cells at various stages of malignant transformation.
Colon cancer

While soft tissue sarcomas are relatively rare, more than 50% of new patients diagnosed
with the disease will die from it. The molecular pathways leading to the development of sarcomas
are relatively unknown, due to the rarity of the disease and variation in pathology. Colon cancer
evolves through a multi-step process whereby pre-malignant colonocytes undergo a relatively
defined sequence of events leading to tumor formation. Several factors participate in the process of
tumor progression and malignant transformation including genetic factors, mutations, and selection.

To understand the nature of gene alterations in colorectal cancer, a number of studies have
focused on the inherited syndromes. Familial adenomatous polyposis (FAP) is caused by mutations
in the adenomatous polyposis coli gene (APC), resulting in truncated or inactive forms of the
protein. This tumor suppressor gene has been mapped to chromosome 5q. Hereditary nonpolyposis
colorectal cancer (HNPCC) is caused by mutations in mismatch repair genes. Although hereditary
colon cancer syndromes occur in a small percentage of the population and most colorectal cancers
are considered sporadic, knowledge from studies of the hereditary syndromes can be generally
applied. For instance, somatic mutations in APC occur in at least 80% of sporadic colon tumors.
APC mutations are thought to be the initiating event in the disease. Other mutations occur
subsequently. Approximately 50% of colorectal cancers contain activating mutations in ras, while
85% contain inactivating mutations in p53. Changes in all of these genes lead to gene expression
changes in colon cancer.
Lung cancer 

Lung cancer is the leading cause of cancer death for men and the second leading cause of
cancer death for women in the U.S. Lung cancers are divided into four histopathologically distinct
groups. Three groups (squamous cell carcinoma, adenocarcinoma, and large cell carcinoma) are
classified as non-small cell lung cancers (NSCLCs). The fourth group of cancers is referred to as
small cell lung cancer (SCLC). Deletions on chromosome 3 are common in this disease and are
thought to indicate the presence of a tumor suppressor gene in this region. Activating mutations in
K-ras are commonly found in lung cancer and are the basis of one of the mouse models for the
disease.
Obesity 

The potential application of gene expression profiling is particularly relevant to improving
diagnosis, prognosis, and treatment of disease. For example, both the levels and sequences
expressed in tissues from subjects with obesity or type II diabetes may be compared with the levels
and sequences expressed in normal tissue.

The primary function of adipose tissue is the ability to store and release fat during periods of
feeding and fasting. White adipose tissue is the major energy reserve in periods of excessive energy
use, and its reserve is mobilized during energy deprivation. Understanding how various molecules
regulate adiposity and energy balance in physiological and pathophysiological situations may lead to
the development of novel therapeutics for human obesity. Adipose tissue is one of the primary target
tissues for insulin. Adipogenesis and insulin resistance are linked in type II diabetes, non-insulin
dependent diabetes mellitus (NIDDM). Most patients with type II diabetes are obese and obesity, in
turn, causes insulin resistance. Cytologically the conversion of preadipocytes into mature
adipocytes is characterized by deposition of fat droplets around the nuclei. The conversion process
in vivo can be induced by thiazolidinediones and other PPARγ agonists (Adams et al. (1997) J Clin
Invest 100:3149-3153) which also lead to increased sensitivity to insulin and reduced plasma glucose
and blood pressure.

Pickup and Crook (1998; Diabetologia 41:1241-8) have suggested that NIDDM may result
from the inability of an individual with hypersensitive acute-phase immune response to carry out
normal cell signaling and repair. Steps in this process are highly correlated with long-term lifestyle
and environment and include: 1) high glucose stimulation of insulin and cytokine production, 2)
influence of various cytokines on tissue remodeling during adipocyte differentiation and their effect
on signaling pathways, and 3) occurrence of tissue damage when cytokines continue to be produced,
extracellular matrix components (ECM) are not recycled, and homeostasis is not timely restored.
Many cytokines and the receptors with which they interact are implicated in this process. These
cytokines include tumor necrosis factor, connective tissue growth factor, transforming growth factor-
beta, interleukin (IL)-13 and their receptors. Tumor necrosis factor contributes to insulin resistance
by inhibiting insulin-stimulated tyrosine phosphorylation of the insulin receptor. This, in turn,
prevents the insulin receptor from participating in normal signaling processes (Skolnik and
Marcusohn (1996) Cytokine Growth Factor Rev 7:161-173; Hotamisligil (1999) J Intern Med
245:621-625). Connective tissue growth factor mediates the buildup of mesangial matrix (Murphy
et al. (2000) J Biol Chem 274:5830-5834). Transforming growth factor-beta mediates the buildup of
mesangial matrix of the kidney and affects vascular function through its interaction with the inositol
trisphosphate receptor, a key intracellular calcium channel (Sharma and McGowan (2000) Cytokine
Growth Factor Rev 11:115-123).

IL-13 and IL-4 are immuno-regulatory cytokines which share many overlapping biological
properties. They both promote growth of B-cells (McKenzie et al. (1993) Proc Natl Acad Sci
90:3735-3739), induce expression of germ line Cε transcripts, and direct naive B cells to switch to
the synthesis of IgE and IgG4 (Punnonen et al. (1993) Proc Natl Acad Sci 90:3730-3734).
Similarly, different isoforms of the IL-13 and IL-4 receptors interact to form four types of IL-13
receptor complexes. In some instances, IL-13 utilizes a receptor complex composed of the IL-4
receptor-α chain (Rα) and the IL-13Rα1. Although the specific role of each chain in IL-13 signaling
is unclear, Ba/F3 cells transfected with IL-13Rα1 display a mitogenic response to IL-13, but cells
transfected with mouse IL-13Rα2 do not. In addition, a soluble IL-13Rα2/Fc fusion protein blocks
the mitogenic response to IL-13 (Donaldson et al. (1998) J Immunol 161:2317-2324). This suggests
that IL-13Rα2 could serve as a dominant negative inhibitor or decoy receptor for IL-13. However,
in colonic carcinoma cell lines, the receptor complex displayed growth inhibition which was
associated with tyrosine phosphorylation of insulin receptor substrate-1. It is evident that more
research is needed to establish 1) which isoforms of the receptor complex promote cell growth and
which inhibit cell growth and 2) whether this varies by cell or tissue type.

The majority of research in adipocyte biology to date has been done using transformed
mouse preadipocyte cell lines. The culture condition which stimulates mouse preadipocyte
differentiation is different from that for inducing human primary preadipocyte differentiation. In
addition, primary cells are diploid and may therefore reflect the in vivo context better than aneuploid
cell lines. Understanding the gene expression profile during adipogenesis in humans will lead to an
understanding of the fundamental mechanism of adiposity regulation. Furthermore, through
comparing the gene expression profiles of adipogenesis between donors with normal weight and
donors with obesity, identification of crucial genes, potential drug targets for obesity and type II
diabetes, will be possible.

Thiazolidinediones (TZDs) act as agonists for the peroxisome-proliferator-activated receptor
gamma (PPARγ), a member of the nuclear hormone receptor superfamily. TZDs reduce
hyperglycemia, hyperinsulinemia, and hypertension, in part by promoting glucose metabolism and
inhibiting gluconeogenesis. Roles for PPARγ and its agonists have been demonstrated in a wide
range of pathological conditions including diabetes, obesity, hypertension, atherosclerosis,
polycystic ovarian syndrome, and cancers such as breast, prostate, liposarcoma, and colon cancer.

The mechanism by which TZDs and other PPARγ agonists enhance insulin sensitivity is not
fully understood, but may involve the ability of PPARγ to promote adipogenesis. When ectopically
expressed in cultured preadipocytes, PPARγ is a potent inducer of adipocyte differentiation. TZDs,
in combination with insulin and other factors, can also enhance differentiation of human
preadipocytes in culture (Adams et al. (1997) J. Clin. Invest. 100:3149-3153). The relative potency
of different TZDs in promoting adipogenesis in vitro is proportional to both their insulin sensitizing
effects in vivo, and their ability to bind and activate PPARγ in vitro. Interestingly, adipocytes
derived from omental adipose depots are refractory to the effects of TZDs. It has therefore been
suggested that the insulin sensitizing effects of TZDs may result from their ability to promote
adipogenesis in subcutaneous adipose depots (Adams et al., supra). Further, dominant negative
mutations in the PPARγ gene have been identified in two non-obese subjects with severe insulin
resistance, hypertension, and overt non-insulin dependent diabetes mellitus (NIDDM) (Barroso et al.
(1998) Nature 402:880-883).

NIDDM is the most common form of diabetes mellitus, a chronic metabolic disease that
affects 143 million people worldwide. NIDDM is characterized by abnormal glucose and lipid
metabolism that results from a combination of peripheral insulin resistance and defective insulin
secretion. NIDDM has a complex, progressive etiology and a high degree of heritability. Numerous
complications of diabetes including heart disease, stroke, renal failure, retinopathy, and peripheral
neuropathy contribute to the high rate of morbidity and mortality.

At the molecular level, PPARγ functions as a ligand activated transcription factor. In the
presence of ligand, PPARγ forms a heterodimer with the retinoid X receptor (RXR) which then
activates transcription of target genes containing one or more copies of a PPARγ response element
(PPRE). Many genes important in lipid storage and metabolism contain PPREs and have been
identified as PPARγ targets, including PEPCK, aP2, LPL, ACS, and FAT-P (Auwerx, J. (1999)
Diabetologia 42:1033-1049). Multiple ligands for PPARγ have been identified. These include a
variety of fatty acid metabolites; synthetic drugs belonging to the TZD class, such as Pioglitazone
and Rosiglitazone (BRL49653); and certain non-glitazone tyrosine analogs such as GI262570 and
GW1929. The prostaglandin derivative 15-dPGJ2 is a potent endogenous ligand for PPARγ.

Expression of PPARγ is very high in adipose but barely detectable in skeletal muscle, the
primary site for insulin stimulated glucose disposal in the body. PPARγ is also moderately
expressed in large intestine, kidney, liver, vascular smooth muscle, hematopoietic cells, and
macrophages. The high expression of PPARγ in adipose tissue suggests that the insulin sensitizing
effects of TZDs may result from alterations in the expression of one or more PPARγ regulated genes
in adipose tissue. Identification of PPARγ target genes will contribute to better drug design and the
development of novel therapeutic strategies for diabetes, obesity, and other conditions.

Systematic attempts to identify PPARγ target genes have been made in several rodent
models of obesity and diabetes (Suzuki et al. (2000) Jpn. J. Pharmacol. 84:113-123; Way et al.
(2001) Endocrinology 142:1269-1277). However, a serious drawback of the rodent gene expression
studies is that significant differences exist between human and rodent models of adipogenesis,
diabetes, and obesity (Taylor (1999) Cell 97:9-12; Gregoire et al. (1998) Physiol. Reviews
78:783-809). Therefore, an unbiased approach to identifying TZD regulated genes in primary
cultures of human tissues is necessary to fully elucidate the molecular basis for diseases associated
with PPARγ activity.

Ovarian Cancer 

Ovarian cancer is the leading cause of death from a gynecologic cancer. The majority of
ovarian cancers are derived from epithelial cells, and 70% of patients with epithelial ovarian cancers
present with late-stage disease. As a result, the long-term survival rate for this disease is very low.
Identification of early-stage markers for ovarian cancer would significantly increase the survival rate.
Genetic variations involved in ovarian cancer development include mutation of p53 and
microsatellite instability. Gene expression patterns likely vary when normal ovary is compared to
ovarian tumors.
Promonocytes 

Leukocytes comprise lymphocytes, granulocytes, and monocytes. Lymphocytes include T-
and B-cells, which specifically recognize and respond to foreign pathogens. T-cells fight viral
infections and activate other leukocytes, while B-cells secrete antibodies that neutralize bacteria and
other microbes. Granulocytes and monocytes are primarily migratory, phagocytic cells that exit the
bloodstream to fight infection in tissues. Monocytes, which are derived from immature
promonocytes, further differentiate into macrophages that engulf and digest microorganisms and
damaged or dead cells. Monocytes and macrophages modulate the immune response by secreting
signaling molecules such as growth factors and cytokines. Tumor necrosis factor-α (TNF-α), for
example, is a macrophage-secreted protein with anti-tumor and anti-viral activity. In addition,
monocytes and macrophages are recruited to sites of infection and inflammation by signaling
proteins secreted by other leukocytes. The differentiation of the monocyte blood cell lineage can be
studied in vitro using cultured cell lines. For example, THP-1 is a human promonocyte cell line that
can be activated by treatment with both phorbol ester such as phorbol myristate acetate (PMA), and
lipopolysaccharide (LPS). PMA is a broad activator of the protein kinase C-dependent pathways.
Monocytes are involved in the initiation and maintenance of inflammatory immune
responses. The outer membrane of gram-negative bacteria expresses lipopolysaccharide (LPS)
complexes called endotoxins. Toxicity is associated with the lipid component (Lipid A) of LPS, and
immunogenicity is associated with the polysaccharide components of LPS. LPS elicits a variety of
inflammatory responses, and because it activates complement by the alternative (properdin) pathway,
it is often part of the pathology of gram-negative bacterial infections. For the most part, endotoxins
remain associated with the cell wall until the bacteria disintegrate. LPS released into the
bloodstream by lysing gram-negative bacteria is first bound by certain plasma proteins identified as
LPS-binding proteins. The LPS-binding protein complex interacts with CD14 receptors on
monocytes, macrophages, B cells, and other types of receptors on endothelial cells. Activation of
human B cells with LPS results in mitogenesis as well as immunoglobulin synthesis. In monocytes
and macrophages three types of events are triggered during their interaction with LPS: 1) production
of cytokines, including IL-1, IL-6, IL-8, TNF-α, and platelet-activating factor, which stimulate
production of prostaglandins and leukotrienes that mediate inflammation and septic shock; 2)
activation of the complement cascade; and 3) activation of the coagulation cascade.
Prostate cancer 

Prostate cancer is a common malignancy in men over the age of 50, and the incidence increases with
age. In the US, there are approximately 132,000 newly diagnosed cases of prostate cancer and more
than 33,000 deaths from the disorder each year.

Once cancer cells arise in the prostate, they are stimulated by testosterone to a more rapid
growth. Thus, removal of the testes can indirectly reduce both rapid growth and metastasis of the
cancer. Over 95 percent of prostatic cancers are adenocarcinomas which originate in the prostatic
acini. The remaining 5 percent are divided between squamous cell and transitional cell carcinomas,
both of which arise in the prostatic ducts or other parts of the prostate gland.

As with most tumors, prostate cancer develops through a multistage progression ultimately
resulting in an aggressive tumor phenotype. The initial step in tumor progression involves the
hyperproliferation of normal luminal and/or basal epithelial cells. Androgen responsive cells
become hyperplastic and evolve into early-stage tumors. Although early-stage tumors are often
androgen sensitive and respond to androgen ablation, a population of androgen independent cells
evolve from the hyperplastic population. These cells represent a more advanced form of prostate
tumor that may become invasive and potentially become metastatic to the bone, brain, or lung. A
variety of genes may be differentially expressed during tumor progression. For example, loss of
heterozygosity (LOH) is frequently observed on chromosome 8p in prostate cancer. Fluorescence in
situ hybridization (FISH) revealed a deletion for at least 1 locus on 8p in 29 (69%) tumors, with a
significantly higher frequency of the deletion on 8p21.2-p21.1 in advanced prostate cancer than in
localized prostate cancer, implying that deletions on 8p22-p21.3 play an important role in tumor
differentiation, while 8p21.2-p21.1 deletion plays a role in progression of prostate cancer (Oba, K. et
al. (2001) Cancer Genet. Cytogenet. 124:20-26).

A primary diagnostic marker for prostate cancer is prostate specific antigen (PSA). PSA is a
tissue-specific serine protease almost exclusively produced by prostatic epithelial cells. The quantity
of PSA correlates with the number and volume of the prostatic epithelial cells, and consequently, the
levels of PSA are an excellent indicator of abnormal prostate growth. Men with prostate cancer
exhibit an early linear increase in PSA levels followed by an exponential increase prior to diagnosis.
However, since PSA levels are also influenced by factors such as inflammation, androgen and other
growth factors, some scientists maintain that changes in PSA levels are not useful in detecting
individual cases of prostate cancer.

Current areas of cancer research provide additional prospects for markers as well as potential
therapeutic targets for prostate cancer. Several growth factors have been shown to play a critical role
in tumor development, growth, and progression. The growth factors Epidermal Growth Factor
(EGF), Fibroblast Growth Factor (FGF), and Tumor Growth Factor alpha (TGFα) are important in
the growth of normal as well as hyperproliferative prostate epithelial cells, particularly at early
stages of tumor development and progression, and affect signaling pathways in these cells in various
ways (Lin, J. et al. (1999) Cancer Res. 59:2891-2897; Putz, T. et al. (1999) Cancer Res. 59:227-233).
The TGF-β family of growth factors are generally expressed at increased levels in human cancers
and the high expression levels in many cases correlates with advanced stages of malignancy and poor
survival (Gold, L.I. (1999) Crit. Rev. Oncog. 10:303-360). Finally, there are human cell lines
representing both the androgen-dependent stage of prostate cancer (LNCap) as well as the
androgen-independent, hormone refractory stage of the disease (PC3 and DU-145) that have proved
useful in studying gene expression patterns associated with the progression of prostate cancer, and
the effects of cell treatments on these expressed genes (Chung, T.D. (1999) Prostate 15:199-207).

There is a need in the art for new compositions, including nucleic acids and proteins, for the
diagnosis, prevention, and treatment of immune system, neurological, developmental, muscle, cell
proliferative disorders, and disorders of lipid metabolism.

SUMMARY OF THE INVENTION 

Various embodiments of the invention provide purified polypeptides, immune response
associated proteins, referred to collectively as "IRAP" and individually as "IRAP-1," "IRAP-2,"
"IRAP-3," "IRAP-4," "IRAP-5," "IRAP-6," "IRAP-7," "IRAP-8," "IRAP-9," "IRAP-10," "IRAP-11,"
"IRAP-12," "IRAP-13," "IRAP-14," "IRAP-15," "IRAP-16," "IRAP-17," "IRAP-18," "IRAP-19,"
"IRAP-20," "IRAP-21," "IRAP-22," "IRAP-23," "IRAP-24," "IRAP-25," "IRAP-26," "IRAP-27,"
"IRAP-28," "IRAP-29," "IRAP-30," "IRAP-31," and "IRAP-32" and methods for using these proteins
and their encoding polynucleotides for the detection, diagnosis, and treatment of diseases and
medical conditions. Embodiments also provide methods for utilizing the purified immune response
associated proteins and/or their encoding polynucleotides for facilitating the drug discovery process,
including determination of efficacy, dosage, toxicity, and pharmacology. Related embodiments
provide methods for utilizing the purified immune response associated proteins and/or their encoding
polynucleotides for investigating the pathogenesis of diseases and medical conditions.

An embodiment provides an isolated polypeptide selected from the group consisting of a) a
polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-
32, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or
at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-32, c) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-32, and d) an immunogenic fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-32. Another
embodiment provides an isolated polypeptide comprising an amino acid sequence of SEQ ID
NO:1-32.
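
By way of illustration only, the "at least 90% identical" criterion recited above can be sketched as a simple calculation over two sequences that have already been aligned. The function names, the gap character, and the convention of counting gapped columns as mismatches are assumptions made for this example; the sketch does not perform the alignment itself and is not the identity algorithm specified elsewhere in this disclosure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding identical residues.

    Assumes both inputs come from the same pairwise alignment, so they have
    equal length. Columns where both sequences have a gap are ignored; columns
    where only one has a gap count as mismatches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical ten-residue peptides differing at one position.
print(meets_identity_threshold("MKTAYIAKQR", "MKTAYIAKQK"))
```

The two hypothetical peptides share nine of ten aligned positions, so they sit exactly at the 90% threshold. Different alignment programs and gap conventions can yield different percentages for the same pair of sequences.
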

Still another embodiment provides an isolated polynucleotide encoding a polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected
from the group consisting of SEQ ID NO:1-32, b) a polypeptide comprising a naturally occurring
amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence
selected from the group consisting of SEQ ID NO:1-32, c) a biologically active fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-32,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-32. In another embodiment, the polynucleotide encodes a
polypeptide selected from the group consisting of SEQ ID NO:1-32. In an alternative embodiment,
the polynucleotide is selected from the group consisting of SEQ ID NO:33-64.

Still another embodiment provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-32, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-32, c) a biologically active fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-32, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-32. Another embodiment provides a cell transformed with the recombinant polynucleotide.
Yet another embodiment provides a transgenic organism comprising the recombinant polynucleotide.

Another embodiment provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-32, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-32, c) a biologically active fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-32, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-32. The method comprises a) culturing a cell under conditions suitable for expression of
the polypeptide, wherein said cell is transformed with a recombinant polynucleotide comprising a
promoter sequence operably linked to a polynucleotide encoding the polypeptide, and b) recovering
the polypeptide so expressed.

Yet another embodiment provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-32, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-32, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-32, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-32.

Still yet another embodiment provides an isolated polynucleotide selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:33-64, b) a polynucleotide comprising a naturally occurring polynucleotide
sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected
from the group consisting of SEQ ID NO:33-64, c) a polynucleotide complementary to the
polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA
equivalent of a)-d). In other embodiments, the polynucleotide can comprise at least about 20, 30, 40,
60, 80, or 100 contiguous nucleotides.

Yet another embodiment provides a method for detecting a target polynucleotide in a
sample, said target polynucleotide being selected from the group consisting of a) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:33-64, b) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:33-64, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and which probe
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization
complex is formed between said probe and said target polynucleotide or fragments thereof, and b)
detecting the presence or absence of said hybridization complex. In a related embodiment, the
method can include detecting the amount of the hybridization complex. In still other embodiments,
the probe can comprise at least about 20, 30, 40, 60, 80, or 100 contiguous nucleotides.
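
The probe requirement described above (at least 20 contiguous nucleotides complementary to the target) can be sketched in Python as follows. This is a simplified illustration that treats hybridization as exact Watson-Crick complementarity over the full probe length; the sequences shown are hypothetical, the function names are assumptions, and real hybridization behavior depends on stringency conditions that this sketch does not model.

```python
# Translation table for DNA bases; handles upper and lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_can_anneal(probe, target, min_length=20):
    """True if the probe has at least `min_length` bases and is exactly
    complementary to a contiguous stretch of the target strand."""
    if len(probe) < min_length:
        return False
    return reverse_complement(probe).upper() in target.upper()


# Hypothetical target strand and a 22-nucleotide probe designed against it.
target = "GGATCCATGGCTAGCTAGGCTTACGATCGATCGGATTACA"
probe = reverse_complement(target[5:27])
print(len(probe), probe_can_anneal(probe, target))
```

A probe shorter than the minimum length is rejected even when it is perfectly complementary, which mirrors the length limitation recited in the embodiment.
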

Still yet another embodiment provides a method for detecting a target polynucleotide in a
sample, said target polynucleotide being selected from the group consisting of a) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:33-64, b) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:33-64, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and b) detecting the presence or absence of said amplified target
polynucleotide or fragment thereof. In a related embodiment, the method can include detecting the
amount of the amplified target polynucleotide or fragment thereof.

Another embodiment provides a composition comprising an effective amount of a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-32, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-32, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-32, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-32, and a pharmaceutically acceptable excipient.
In one embodiment, the composition can comprise an amino acid sequence selected from the group
consisting of SEQ ID NO:1-32. Other embodiments provide a method of treating a disease or
condition associated with decreased or abnormal expression of functional IRAP, comprising
administering to a patient in need of such treatment the composition.

Yet another embodiment provides a method for screening a compound for effectiveness as
an agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-32, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-32, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-32, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-32. The method comprises a)
contacting a sample comprising the polypeptide with a compound, and b) detecting agonist activity
in the sample. Another embodiment provides a composition comprising an agonist compound
identified by the method and a pharmaceutically acceptable excipient. Yet another embodiment
provides a method of treating a disease or condition associated with decreased expression of
functional IRAP, comprising administering to a patient in need of such treatment the composition.

Still yet another embodiment provides a method for screening a compound for effectiveness
as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising
an amino acid sequence selected from the group consisting of SEQ ID NO:1-32, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-32, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-32, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-32. The method comprises a)
contacting a sample comprising the polypeptide with a compound, and b) detecting antagonist
activity in the sample. Another embodiment provides a composition comprising an antagonist
compound identified by the method and a pharmaceutically acceptable excipient. Yet another
embodiment provides a method of treating a disease or condition associated with overexpression of
functional IRAP, comprising administering to a patient in need of such treatment the composition.

Another embodiment provides a method of screening for a compound that specifically binds
to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-32, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-32, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-32, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-32. The method comprises a) combining the
polypeptide with at least one test compound under suitable conditions, and b) detecting binding of
the polypeptide to the test compound, thereby identifying a compound that specifically binds to the
polypeptide.

Yet another embodiment provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-32, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-32, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-32, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-32. The method comprises a) combining the
polypeptide with at least one test compound under conditions permissive for the activity of the
polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c)
comparing the activity of the polypeptide in the presence of the test compound with the activity of
the polypeptide in the absence of the test compound, wherein a change in the activity of the
polypeptide in the presence of the test compound is indicative of a compound that modulates the
activity of the polypeptide.

Still yet another embodiment provides a method for screening a compound for effectiveness
in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
polynucleotide sequence selected from the group consisting of SEQ ID NO:33-64, the method
comprising a) contacting a sample comprising the target polynucleotide with a compound, b)
detecting altered expression of the target polynucleotide, and c) comparing the expression of the
target polynucleotide in the presence of varying amounts of the compound and in the absence of the
compound.

Another embodiment provides a method for assessing toxicity of a test compound, said
method comprising a) treating a biological sample containing nucleic acids with the test compound;
b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:33-64, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:33-64, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide
complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization
occurs under conditions whereby a specific hybridization complex is formed between said probe and
a target polynucleotide in the biological sample, said target polynucleotide selected from the group
consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:33-64, ii) a polynucleotide comprising a naturally occurring
polynucleotide sequence at least 90% identical or at least about 90% identical to a polynucleotide
sequence selected from the group consisting of SEQ ID NO:33-64, iii) a polynucleotide
complementary to the polynucleotide of i), iv) a polynucleotide complementary to the
polynucleotide of ii), and v) an RNA equivalent of i)-iv). Alternatively, the target polynucleotide
can comprise a fragment of a polynucleotide selected from the group consisting of i)-v) above; c)
quantifying the amount of hybridization complex; and d) comparing the amount of hybridization
complex in the treated biological sample with the amount of hybridization complex in an untreated
biological sample, wherein a difference in the amount of hybridization complex in the treated
biological sample is indicative of toxicity of the test compound.


BRIEF DESCRIPTION OF THE TABLES 
Table 1 summarizes the nomenclature for full length polynucleotide and polypeptide 
embodiments of the invention. 

Table 2 shows the GenBank identification number and annotation of the nearest GenBank
homolog, and the PROTEOME database identification numbers and annotations of PROTEOME
database homologs, for polypeptide embodiments of the invention. The probability scores for the
matches between each polypeptide and its homolog(s) are also shown.

Table 3 shows structural features of polypeptide embodiments, including predicted motifs
and domains, along with the methods, algorithms, and searchable databases used for analysis of the
polypeptides.

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble
polynucleotide embodiments, along with selected fragments of the polynucleotides.

Table 5 shows representative cDNA libraries for polynucleotide embodiments.

Table 6 provides an appendix which describes the tissues and vectors used for construction
of the cDNA libraries shown in Table 5.


Table 7 shows the tools, programs, and algorithms used to analyze polynucleotides and
polypeptides, along with applicable descriptions, references, and threshold parameters.

Table 8 shows single nucleotide polymorphisms found in polynucleotide sequences of the
invention, along with allele frequencies in different human populations.


DESCRIPTION OF THE INVENTION 

Before the present proteins, nucleic acids, and methods are described, it is understood that
embodiments of the invention are not limited to the particular machines, instruments, materials, and
methods described, as these may vary. It is also to be understood that the terminology used herein is
for the purpose of describing particular embodiments only, and is not intended to limit the scope of
the invention.

As used herein and in the appended claims, the singular forms "a," "an," and "the" include
plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "a
host cell" includes a plurality of such host cells, and a reference to "an antibody" is a reference to
one or more antibodies and equivalents thereof known to those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein have the same
meanings as commonly understood by one of ordinary skill in the art to which this invention
belongs. Although any machines, materials, and methods similar or equivalent to those described
herein can be used to practice or test the present invention, the preferred machines, materials and
methods are now described. All publications mentioned herein are cited for the purpose of
describing and disclosing the cell lines, protocols, reagents and vectors which are reported in the
publications and which might be used in connection with various embodiments of the invention.
Nothing herein is to be construed as an admission that the invention is not entitled to antedate such
disclosure by virtue of prior invention.

DEFINITIONS

"IRAP" refers to the amino acid sequences of substantially purified IRAP obtained from any
species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and
human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of
IRAP. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other
compound or composition which modulates the activity of IRAP either by directly interacting with
IRAP or by acting on components of the biological pathway in which IRAP participates.

An "allelic variant" is an alternative form of the gene encoding IRAP. Allelic variants may
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or
many allelic variants of its naturally occurring form. Common mutational changes which give rise to
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides.
Each of these types of changes may occur alone, or in combination with the others, one or more
times in a given sequence.

"Altered" nucleic acid sequences encoding IRAP include those sequences with deletions,
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as IRAP or a
polypeptide with at least one functional characteristic of IRAP. Included within this definition are
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe
of the polynucleotide encoding IRAP, and improper or unexpected hybridization to allelic variants,
with a locus other than the normal chromosomal locus for the polynucleotide encoding IRAP. The
encoded protein may also be "altered," and may contain deletions, insertions, or substitutions of
amino acid residues which produce a silent change and result in a functionally equivalent IRAP.
Deliberate amino acid substitutions may be made on the basis of one or more similarities in polarity,
charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as
long as the biological or immunological activity of IRAP is retained. For example, negatively
charged amino acids may include aspartic acid and glutamic acid, and positively charged amino
acids may include lysine and arginine. Amino acids with uncharged polar side chains having similar
hydrophilicity values may include: asparagine and glutamine; and serine and threonine. Amino acids
with uncharged side chains having similar hydrophilicity values may include: leucine, isoleucine,
and valine; glycine and alanine; and phenylalanine and tyrosine.

The terms "amino acid" and "amino acid sequence" can refer to an oligopeptide, a peptide, a
polypeptide, or a protein sequence, or a fragment of any of these, and to naturally occurring or
synthetic molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally
occurring protein molecule, "amino acid sequence" and like terms are not meant to limit the amino
acid sequence to the complete native amino acid sequence associated with the recited protein
molecule.

"Amplification" relates to the production of additional copies of a nucleic acid.
Amplification may be carried out using polymerase chain reaction (PCR) technologies or other
nucleic acid amplification technologies well known in the art.

The term "antagonist" refers to a molecule which inhibits or attenuates the biological
activity of IRAP. Antagonists may include proteins such as antibodies, anticalins, nucleic acids,
carbohydrates, small molecules, or any other compound or composition which modulates the activity
of IRAP either by directly interacting with IRAP or by acting on components of the biological
pathway in which IRAP participates.

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic
determinant. Antibodies that bind IRAP polypeptides can be prepared using intact polypeptides or
using fragments containing small peptides of interest as the immunizing antigen. The polypeptide or
oligopeptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the
translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired.
Commonly used carriers that are chemically coupled to peptides include bovine serum albumin,
thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to
immunize the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that
makes contact with a particular antibody. When a protein or a fragment of a protein is used to
immunize a host animal, numerous regions of the protein may induce the production of antibodies
which bind specifically to antigenic determinants (particular regions or three-dimensional structures
on the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen
used to elicit the immune response) for binding to an antibody.

The term "aptamer" refers to a nucleic acid or oligonucleotide molecule that binds to a
specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX
(Systematic Evolution of Ligands by EXponential Enrichment), described in U.S. Patent No.
5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries.
Aptamer compositions may be double-stranded or single-stranded, and may include
deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules.
The nucleotide components of an aptamer may have modified sugar groups (e.g., the 2'-OH group of
a ribonucleotide may be replaced by 2'-F or 2'-NH2), which may improve a desired property, e.g.,
resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules,
e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system.
Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a
cross-linker (Brody, E.N. and L. Gold (2000) J. Biotechnol. 74:5-13).

The term "intramer" refers to an aptamer which is expressed in vivo. For example, a
vaccinia virus-based RNA expression system has been used to express specific RNA aptamers at
high levels in the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA
96:3606-3610).

The term "spiegelmer" refers to an aptamer which includes L-DNA, L-RNA, or other left-handed
nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed
nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on
substrates containing right-handed nucleotides.

The term "antisense" refers to any composition capable of base-pairing with the "sense"
(coding) strand of a polynucleotide having a specific nucleic acid sequence. Antisense compositions
may include DNA; RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone
linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides
having modified sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or
oligonucleotides having modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-
deoxyguanosine. Antisense molecules may be produced by any method including chemical synthesis
or transcription. Once introduced into a cell, the complementary antisense molecule base-pairs with
a naturally occurring nucleic acid sequence produced by the cell to form duplexes which block either
transcription or translation. The designation "negative" or "minus" can refer to the antisense strand,
and the designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule.

The term "biologically active" refers to a protein having structural, regulatory, or
biochemical functions of a naturally occurring molecule. Likewise, "immunologically active" or
"immunogenic" refers to the capability of the natural, recombinant, or synthetic IRAP, or of any
oligopeptide thereof, to induce a specific immune response in appropriate animals or cells and to
bind with specific antibodies.

"Complementary" describes the relationship between two single-stranded nucleic acid
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement,
3'-TCA-5'.
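The base-pairing rule in this definition can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the specification; the sketch assumes only the standard Watson-Crick DNA pairs (A with T, G with C).

```python
# Watson-Crick pairing for DNA; all names here are illustrative only.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq: str) -> str:
    """Return the base-by-base complement; the result reads 3'->5' if seq reads 5'->3'."""
    return "".join(PAIRS[base] for base in seq.upper())

def reverse_complement(seq: str) -> str:
    """Return the complementary strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]

print(complement("AGT"))          # TCA, i.e. 3'-TCA-5'
print(reverse_complement("AGT"))  # ACT, the same strand written 5'->3'
```

The first call reproduces the example in the definition: 5'-AGT-3' pairs with 3'-TCA-5'.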

A "composition comprising a given polynucleotide" and a "composition comprising a given
polypeptide" can refer to any composition containing the given polynucleotide or polypeptide. The
composition may comprise a dry formulation or an aqueous solution. Compositions comprising
polynucleotides encoding IRAP or fragments of IRAP may be employed as hybridization probes.
The probes may be stored in freeze-dried form and may be associated with a stabilizing agent such as
a carbohydrate. In hybridizations, the probe may be deployed in an aqueous solution containing salts
(e.g., NaCl), detergents (e.g., sodium dodecyl sulfate; SDS), and other components (e.g., Denhardt's
solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to
repeated DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer
program for fragment assembly, such as the GELVIEW fragment assembly system (Accelrys,
Burlington MA) or Phrap (University of Washington, Seattle WA). Some sequences have been both
extended and assembled to produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least
interfere with the properties of the original protein, i.e., the structure and especially the function of
the protein is conserved and not significantly changed by such substitutions. The table below shows
amino acids which may be substituted for an original amino acid in a protein and which are regarded
as conservative amino acid substitutions.





        Original Residue        Conservative Substitution
        Ala                     Gly, Ser
        Arg                     His, Lys
        Asn                     Asp, Gln, His
        Asp                     Asn, Glu
        Cys                     Ala, Ser
        Gln                     Asn, Glu, His
        Glu                     Asp, Gln, His
        Gly                     Ala
        His                     Asn, Arg, Gln, Glu
        Ile                     Leu, Val
        Leu                     Ile, Val
        Lys                     Arg, Gln, Glu
        Met                     Leu, Ile
        Phe                     His, Met, Leu, Trp, Tyr
        Ser                     Cys, Thr
        Thr                     Ser, Val
        Trp                     Phe, Tyr
        Tyr                     His, Phe, Trp
        Val                     Ile, Leu, Thr



Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation,
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of
the side chain.
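The substitution table above can be expressed as a simple lookup. The following Python sketch is illustrative only: the dictionary transcribes the table as read here from the scanned text (entries that were garbled in the scan, such as Gln and Ile, are best-effort readings), and the function name is hypothetical.

```python
# Conservative substitutions keyed by original residue (three-letter codes),
# transcribed from the table above; garbled scan entries are best-effort readings.
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}

def is_conservative(original: str, replacement: str) -> bool:
    """True if the table lists `replacement` as a conservative substitute for `original`."""
    return replacement in CONSERVATIVE.get(original, set())

print(is_conservative("Ile", "Val"))  # listed in the table
print(is_conservative("Gly", "Trp"))  # not listed
```

Note that the table is directional: Ser is listed as a substitute for Ala, but Ala is not listed as a substitute for Ser, so a lookup must be keyed on the original residue.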

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the
absence of one or more amino acid residues or nucleotides.

The term "derivative" refers to a chemically modified polynucleotide or polypeptide.
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by
an alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which
retains at least one biological or immunological function of the natural molecule. A derivative
polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least
one biological or immunological function of the polypeptide from which it was derived.

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or
absent gene or protein expression, determined by comparing at least two different samples. Such
comparisons may be carried out between, for example, a treated and an untreated sample, or a
diseased and a normal sample.

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an
exon may represent a structural or functional domain of the encoded protein, new proteins may be
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the
evolution of new protein functions.

A "fragment" is a unique portion of IRAP or a polynucleotide encoding IRAP which can be
identical in sequence to, but shorter in length than, the parent sequence. A fragment may comprise
up to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For
example, a fragment may comprise from about 5 to about 1000 contiguous nucleotides or amino acid
residues. A fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes,
may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous
nucleotides or amino acid residues in length. Fragments may be preferentially selected from certain
regions of a molecule. For example, a polypeptide fragment may comprise a certain length of
contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a
polypeptide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any
length that is supported by the specification, including the Sequence Listing, tables, and figures, may
be encompassed by the present embodiments.
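The length constraint in the definition of a fragment (contiguous, and at most the full length minus one residue) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names and the example sequence are hypothetical and are not taken from the Sequence Listing, and the sketch does not test whether a fragment is "unique" within a genome.

```python
from typing import Iterator

def is_fragment(candidate: str, parent: str) -> bool:
    """True if candidate is a contiguous stretch of parent and strictly shorter than it."""
    return 0 < len(candidate) < len(parent) and candidate in parent

def fragments(parent: str, length: int) -> Iterator[str]:
    """Yield every contiguous fragment of the requested length, left to right."""
    if not 0 < length < len(parent):
        raise ValueError("fragment length must be between 1 and len(parent) - 1")
    for start in range(len(parent) - length + 1):
        yield parent[start:start + length]

parent = "ATGGCCAAGCTTGGATCC"  # hypothetical 18-nucleotide sequence
print(is_fragment("GCCAAG", parent))   # contiguous and shorter than the parent
print(is_fragment(parent, parent))     # full length, so not a fragment
print(list(fragments(parent, len(parent) - 1)))  # the two longest possible fragments
```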

A fragment of SEQ ID NO:33-64 can comprise a region of unique polynucleotide sequence
that specifically identifies SEQ ID NO:33-64, for example, as distinct from any other sequence in the
genome from which the fragment was obtained. A fragment of SEQ ID NO:33-64 can be employed
in one or more embodiments of methods of the invention, for example, in hybridization and
amplification technologies and in analogous methods that distinguish SEQ ID NO:33-64 from related
polynucleotides. The precise length of a fragment of SEQ ID NO:33-64 and the region of SEQ ID
NO:33-64 to which the fragment corresponds are routinely determinable by one of ordinary skill in
the art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-32 is encoded by a fragment of SEQ ID NO:33-64. A fragment
of SEQ ID NO:1-32 can comprise a region of unique amino acid sequence that specifically identifies
SEQ ID NO:1-32. For example, a fragment of SEQ ID NO:1-32 can be used as an immunogenic
peptide for the development of antibodies that specifically recognize SEQ ID NO:1-32. The precise
length of a fragment of SEQ ID NO:1-32 and the region of SEQ ID NO:1-32 to which the fragment
corresponds can be determined based on the intended purpose for the fragment using one or more
analytical methods described herein or otherwise known in the art.

A "full length" polynucleotide is one containing at least a translation initiation codon (e.g.,
methionine) followed by an open reading frame and a translation termination codon. A "full length"
polynucleotide sequence encodes a "full length" polypeptide sequence.
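The three elements named in this definition (initiation codon, open reading frame, termination codon) can be checked mechanically. The sketch below is a simplified illustration, not the method of the specification: it assumes a DNA coding sequence with no untranslated regions, ATG as the only initiation codon, and the three standard stop codons; the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet

def is_full_length_orf(seq: str) -> bool:
    """True if seq starts with ATG, ends with a stop codon, and has no in-frame stop between."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    starts_ok = codons[0] == "ATG"
    ends_ok = codons[-1] in STOP_CODONS
    no_internal_stop = not any(c in STOP_CODONS for c in codons[1:-1])
    return starts_ok and ends_ok and no_internal_stop

print(is_full_length_orf("ATGGCCTAA"))     # start codon, one sense codon, stop codon
print(is_full_length_orf("GCCGCCTAA"))     # no initiation codon
print(is_full_length_orf("ATGTAAGCCTAA"))  # premature in-frame stop
```

A real full length polynucleotide may also carry 5' and 3' untranslated sequence, which this sketch deliberately ignores.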

"Homology" refers to sequence similarity or, alternatively, sequence identity, between two
or more polynucleotide sequences or two or more polypeptide sequences.

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer
to the percentage of identical nucleotide matches between at least two polynucleotide sequences
aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and
reproducible way, gaps in the sequences being compared in order to optimize alignment between two
sequences, and therefore achieve a more meaningful comparison of the two sequences.
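Once two sequences have been aligned, percent identity reduces to counting matching columns. The sketch below assumes the alignment has already been produced by some alignment program and is supplied as two equal-length strings with "-" marking gaps. It also assumes that the denominator is the number of aligned columns, gaps included; alignment programs differ on this convention, so the figure it reports need not match any particular tool. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same (non-gap) residue in both rows."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)

# Hypothetical alignment: one gap column and one mismatch in nine columns.
print(round(percent_identity("ACGT-ACGT", "ACGTTACCT"), 2))  # 7 of 9 columns match
```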

Percent identity between polynucleotide sequences may be determined using one or more
computer algorithms or programs known in the art or described herein. For example, percent
identity can be determined using the default parameters of the CLUSTAL V algorithm as
incorporated into the MEGALIGN version 3.12e sequence alignment program. This program is part
of the LASERGENE software package, a suite of molecular biological analysis programs
(DNASTAR, Madison WI). CLUSTAL V is described in Higgins, D.G. and P.M. Sharp (1989;
CABIOS 5:151-153) and in Higgins, D.G. et al. (1992; CABIOS 8:189-191). For pairwise
alignments of polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap
penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue weight table is selected as
the default.

Alternatively, a suite of commonly used and freely available sequence comparison
algorithms which can be used is provided by the National Center for Biotechnology Information
(NCBI) Basic Local Alignment Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol.
215:403-410), which is available from several sources, including the NCBI, Bethesda, MD, and on
the Internet at ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence
analysis programs including "blastn," that is used to align a known polynucleotide sequence with
other polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST
2 Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at ncbi.nlm.nih.gov/gorf/bl2.html. The "BLAST
2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST programs are
commonly used with gap and other parameters set to default settings. For example, to compare two
nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version 2.0.12
(April-21-2000) set at default parameters. Such default parameters may be, for example:
        Matrix: BLOSUM62
        Reward for match: 1
        Penalty for mismatch: -2
        Open Gap: 5 and Extension Gap: 2 penalties
        Gap x drop-off: 50
        Expect: 10
        Word Size: 11
        Filter: on
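To show how the match, mismatch, and gap values listed above combine, the following Python sketch scores a single gapped alignment. It is a simplified illustration and not a reimplementation of blastn: it assumes an affine gap cost in which a gap of length k costs the opening penalty plus k times the extension penalty, and it ignores the remaining parameters (drop-off, expect, word size, filter), which govern the search rather than the score of one alignment. The function name and the example alignments are hypothetical.

```python
def raw_alignment_score(aligned_a: str, aligned_b: str,
                        match: int = 1, mismatch: int = -2,
                        gap_open: int = 5, gap_extend: int = 2) -> int:
    """Score an existing alignment (rows of equal length, '-' for gaps).

    Assumption: a gap run of length k costs gap_open + k * gap_extend.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    previous_gap_row = None  # which row carried a gap in the previous column
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            gap_row = "a" if x == "-" else "b"
            if gap_row != previous_gap_row:
                score -= gap_open      # a new gap run starts here
            score -= gap_extend        # every gap column pays the extension cost
            previous_gap_row = gap_row
        else:
            score += match if x == y else mismatch
            previous_gap_row = None
    return score

print(raw_alignment_score("ACGT-ACGT", "ACGTTACCT"))  # 7 matches, 1 mismatch, 1 gap of length 1
print(raw_alignment_score("AC--GT", "ACTTGT"))        # 4 matches, 1 gap of length 2
```

With the listed values, the first alignment scores 7 - 2 - (5 + 2) = -2 and the second scores 4 - (5 + 4) = -5, which shows why a single short gap is costly relative to a mismatch under these settings.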

Percent identity may be measured over the length of an entire defined sequence, for example,
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example,
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length
supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to
describe a length over which percentage identity may be measured.

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that
changes in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic
acid sequences that all encode substantially the same protein.
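
As a non-limiting illustration of this degeneracy, the following Python sketch translates two
short nucleotide sequences with the standard genetic code and compares them position by position.
Both sequences are hypothetical and were invented for this example; they are not taken from the
Sequence Listing. The function names translate and percent_identity are likewise assumptions of
the sketch, and the ungapped comparison is a simplification of a BLAST alignment.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical sequences that differ at many positions but use synonymous codons.
seq_a = "ATGCTTTCTCGTGGT"
seq_b = "ATGTTAAGCAGAGGC"

print(translate(seq_a), translate(seq_b))
print(round(percent_identity(seq_a, seq_b), 1))
```

In this sketch the two nucleotide sequences share fewer than half of their positions, yet both
translate to the same five-residue peptide.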

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer
to the percentage of identical residue matches between at least two polypeptide sequences aligned
using a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some
alignment methods take into account conservative amino acid substitutions. Such conservative
substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at
the site of substitution, thus preserving the structure (and therefore function) of the polypeptide. The
phrases "percent similarity" and "% similarity," as applied to polypeptide sequences, refer to the
percentage of residue matches, including identical residue matches and conservative substitutions,
between at least two polypeptide sequences aligned using a standardized algorithm. In contrast,
conservative substitutions are not included in the calculation of percent identity between polypeptide
sequences.
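
The distinction between percent identity and percent similarity may be sketched as follows. The
sketch assumes that the two polypeptide sequences have already been aligned (gaps shown as "-"),
and it counts only columns in which both sequences have a residue; that choice of denominator is an
assumption, as alignment programs differ on this point. The conservative substitution groups below
are a simplified, hypothetical grouping chosen for illustration and are not the substitution table
of this specification.

```python
# Simplified, hypothetical groups of residues treated as conservative substitutions.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # hydroxyl
    set("NQ"),    # amide
]


def is_conservative(res_a, res_b):
    """True if two different residues fall in the same illustrative group."""
    return any(res_a in group and res_b in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no aligned residue pairs to compare")
    identical = sum(a == b for a, b in columns)
    similar = sum(a == b or is_conservative(a, b) for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


# Hypothetical aligned peptides: three identical columns, three conservative changes.
print(identity_and_similarity("MKVLDS", "MRILES"))
```

Conservative substitutions raise the similarity figure but leave the identity figure unchanged,
which is the contrast drawn in the definitions above.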

Percent identity between polypeptide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program (described and referenced above). For pairwise alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default
residue weight table.

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for
example:

Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 3
Filter: on

Percent identity may be measured over the length of an entire defined polypeptide sequence,
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length,
for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be
used to describe a length over which percentage identity may be measured.

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for
chromosome replication, segregation and maintenance.

The term "humanized antibody" refers to an antibody molecule in which the amino acid
sequence in the non-antigen binding regions has been altered so that the antibody more closely
resembles a human antibody, and still retains its original binding ability.

"Hybridization" refers to the process by which a polynucleotide strand anneals with a
complementary strand through base pairing under defined hybridization conditions. Specific
hybridization is an indication that two nucleic acid sequences share a high degree of
complementarity. Specific hybridization complexes form under permissive annealing conditions and
remain hybridized after the "washing" step(s). The washing step(s) is particularly important in
determining the stringency of the hybridization process, with more stringent conditions allowing less
non-specific binding, i.e., binding between pairs of nucleic acid strands that are not perfectly
matched. Permissive conditions for annealing of nucleic acid sequences are routinely determinable
by one of ordinary skill in the art and may be consistent among hybridization experiments, whereas
wash conditions may be varied among experiments to achieve the desired stringency, and therefore
hybridization specificity. Permissive annealing conditions occur, for example, at 68°C in the
presence of about 6 x SSC, about 1% (w/v) SDS, and about 100 µg/ml sheared, denatured salmon
sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature
under which the wash step is carried out. Such wash temperatures are typically selected to be about
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and
conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. and D.W.
Russell (2001; Molecular Cloning: A Laboratory Manual, 3rd ed., vol. 1-3, Cold Spring Harbor
Press, Cold Spring Harbor NY, ch. 9).
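
By way of illustration only, the relationship between Tm and wash temperature may be sketched in
Python as shown below. The Tm expression used is one widely quoted empirical approximation for
DNA duplexes, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 675/N; it is an assumption of this sketch
rather than an equation reproduced from this specification or from the cited reference, and the
constants differ somewhat between sources. The sodium concentration assumed for 1 x SSC (about
0.195 M, from 0.15 M sodium chloride plus 0.015 M trisodium citrate) and the probe sequence are
likewise illustrative.

```python
import math

# Assumed sodium ion concentration of 1 x SSC, in mol/L (illustrative value).
SODIUM_MOLAR_PER_1X_SSC = 0.195


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def melting_temperature(seq, sodium_molar):
    """Approximate Tm in degrees C using a common empirical formula (an assumption)."""
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_percent(seq)
        - 675.0 / len(seq)
    )


def wash_temperature_window(tm):
    """Wash temperatures about 20 to 5 degrees C below Tm, as (lowest, highest)."""
    return tm - 20.0, tm - 5.0


# Hypothetical 100-nucleotide probe with 50% GC content, washed in 0.2 x SSC.
probe = "ATGC" * 25
tm = melting_temperature(probe, 0.2 * SODIUM_MOLAR_PER_1X_SSC)
low, high = wash_temperature_window(tm)
print(f"Tm ~ {tm:.1f} C; wash between {low:.1f} C and {high:.1f} C")
```

Lowering the salt concentration lowers the estimated Tm, so a wash at a fixed temperature becomes
more stringent as the SSC concentration is reduced.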
High stringency conditions for hybridization between polynucleotides of the present
invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS,
for 1 hour. Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC
concentration may be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%.
Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents
include, for instance, sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic
solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under
particular circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash
conditions will be readily apparent to those of ordinary skill in the art. Hybridization, particularly
under high stringency conditions, may be suggestive of evolutionary similarity between the
nucleotides. Such similarity is strongly indicative of a similar role for the nucleotides and their
encoded polypeptides.

The term "hybridization complex" refers to a complex formed between two nucleic acids by
virtue of the formation of hydrogen bonds between complementary bases. A hybridization complex
may be formed in solution (e.g., C0t or R0t analysis) or formed between one nucleic acid present in
solution and another nucleic acid immobilized on a solid support (e.g., paper, membranes, filters,
chips, pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids
have been fixed).

The words "insertion" and "addition" refer to changes in an amino acid or polynucleotide
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.

"Immune response" can refer to conditions associated with inflammation, trauma, immune
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect
cellular and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of IRAP which is
capable of eliciting an immune response when introduced into a living organism, for example, a
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment
of IRAP which is useful in any of the antibody production methods disclosed herein or known in the
art.

The term "microarray" refers to an arrangement of a plurality of polynucleotides,
polypeptides, antibodies, or other chemical compounds on a substrate.






The terms "element" and "array element" refer to a polynucleotide, polypeptide, antibody, or
other chemical compound having a unique and defined position on a microarray.

The term "modulate" refers to a change in the activity of IRAP. For example, modulation
may cause an increase or a decrease in protein activity, binding characteristics, or any other
biological, functional, or immunological properties of IRAP.

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide,
oligonucleotide, polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA
of genomic or synthetic origin which may be single-stranded or double-stranded and may represent
the sense or the antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like
material.

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where
necessary to join two protein coding regions, in the same reading frame.

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition.
PNAs preferentially bind complementary single stranded DNA or RNA and stop transcript
elongation, and may be pegylated to extend their lifespan in the cell.

"Post-translational modification" of an IRAP may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in
the art. These processes may occur synthetically or biochemically. Biochemical modifications will
vary by cell type depending on the enzymatic milieu of IRAP.

"Probe" refers to nucleic acids encoding IRAP, their complements, or fragments thereof,
which are used to detect identical, allelic or related nucleic acids. Probes are isolated
oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical
labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are
short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide
by complementary base-pairing. The primer may then be extended along the target DNA strand by a
DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a
nucleic acid, e.g., by the polymerase chain reaction (PCR).

Probes and primers as used in the present invention typically comprise at least 15 contiguous
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may
also be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90,
100, or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and
primers may be considerably longer than these examples, and it is understood that any length
supported by the specification, including the tables, figures, and Sequence Listing, may be used.
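
For illustration, the notion of a probe or primer comprising a given number of contiguous
nucleotides of a known sequence may be sketched in Python as follows. The target sequence is
hypothetical, and the function names candidate_probes and reverse_complement are assumptions of
this sketch; the sketch merely enumerates windows and does not assess specificity, melting
temperature, or secondary structure as a primer selection program would.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(seq, length=15):
    """List every window of `length` contiguous nucleotides in `seq`."""
    seq = seq.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Hypothetical 20-nucleotide target sequence.
target = "ATGGCTAGCTAGGATCCGTA"
probes = candidate_probes(target, length=15)
print(len(probes), probes[0], reverse_complement(probes[0]))
```

A sequence shorter than the requested length yields no candidates, which mirrors the requirement
that a probe comprise at least the stated number of contiguous nucleotides.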
Methods for preparing and using probes and primers are described in, for example,
Sambrook, J. and D.W. Russell (2001; Molecular Cloning: A Laboratory Manual, 3rd ed., vol. 1-3,
Cold Spring Harbor Press, Cold Spring Harbor NY), Ausubel, F.M. et al. (1999; Short Protocols in
Molecular Biology, 4th ed., John Wiley & Sons, New York NY), and Innis, M. et al. (1990; PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA). PCR primer
pairs can be derived from a known sequence, for example, by using computer programs intended for
that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research,
Cambridge MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to
5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer
selection programs have incorporated additional features for expanded capabilities. For example, the
PrimOU primer selection program (available to the public from the Genome Center at University of
Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from
megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3
primer selection program (available to the public from the Whitehead Institute/MIT Center for
Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which
sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the
selection of oligonucleotides for microarrays. (The source code for the latter two primer selection
programs may also be obtained from their respective sources and modified to meet the user's specific
needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping
Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments,
thereby allowing selection of primers that hybridize to either the most conserved or least conserved
regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both
unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and
polynucleotide fragments identified by any of the above selection methods are useful in
hybridization technologies, for example, as PCR or sequencing primers, microarray elements, or
specific probes to identify fully or partially complementary polynucleotides in a sample of nucleic
acids. Methods of oligonucleotide selection are not limited to those described above.

A "recombinant nucleic acid" is a nucleic acid that is not naturally occurring or has a
sequence that is made by an artificial combination of two or more otherwise separated segments of
sequence. This artificial combination is often accomplished by chemical synthesis or, more
commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic
engineering techniques such as those described in Sambrook and Russell (supra). The term
recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion
of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid
sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a
vector that is used, for example, to transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions
(UTRs). Regulatory elements interact with host or viral proteins which control transcription,
translation, or RNA stability.

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid,
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent,
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in the art.

An "RNA equivalent," in reference to a DNA molecule, is composed of the same linear
sequence of nucleotides as the reference DNA molecule with the exception that all occurrences of
the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose
instead of deoxyribose.
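
At the level of sequence notation, the RNA equivalent of a DNA molecule may be sketched in
Python as shown below. The function name rna_equivalent and the example sequence are assumptions
made for illustration; the change of sugar backbone is a chemical property and is not represented in
a letter sequence.

```python
def rna_equivalent(dna):
    """Return the RNA equivalent of a DNA sequence by replacing thymine (T) with uracil (U)."""
    dna = dna.upper()
    invalid = set(dna) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(invalid)}")
    return dna.replace("T", "U")


# Hypothetical sequence used only to demonstrate the substitution.
print(rna_equivalent("ATGCTTGA"))
```

The linear order of nucleotides is preserved; only the letter representing thymine changes.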

The term "sample" is used in its broadest sense. A sample suspected of containing IRAP,
nucleic acids encoding IRAP, or fragments thereof may comprise a bodily fluid; an extract from a
cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or
cDNA, in solution or bound to a substrate; a tissue; a tissue print; etc.

The terms "specific binding" and "specifically binding" refer to that interaction between a
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or
synthetic binding composition. The interaction is dependent upon the presence of a particular
structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding
molecule. For example, if an antibody is specific for epitope "A," the presence of a polypeptide
comprising the epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A
and the antibody will reduce the amount of labeled A that binds to the antibody.

The term "substantially purified" refers to nucleic acid or amino acid sequences that are
removed from their natural environment and are isolated or separated, and are at least about 60%
free, preferably at least about 75% free, and most preferably at least about 90% free from other
components with which they are naturally associated.

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides
by different amino acid residues or nucleotides, respectively.

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters,
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers,
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" or "expression profile" refers to the collective pattern of gene
expression by a particular cell type or tissue under given conditions at a given time.

"Transformation" describes a process by which exogenous DNA is introduced into a
recipient cell. Transformation may occur under natural or artificial conditions according to various
methods well known in the art, and may rely on any known method for the insertion of foreign
nucleic acid sequences into a prokaryotic or eukaryotic host cell. The method for transformation is
selected based on the type of host cell being transformed and may include, but is not limited to,
bacteriophage or viral infection, electroporation, heat shock, lipofection, and particle bombardment.
The term "transformed cells" includes stably transformed cells in which the inserted DNA is capable
of replication either as an autonomously replicating plasmid or as part of the host chromosome, as
well as transiently transformed cells which express the inserted DNA or RNA for limited periods of
time.

A "transgenic organism," as used herein, is any organism, including but not limited to
animals and plants, in which one or more of the cells of the organism contains heterologous nucleic
acid introduced by way of human intervention, such as by transgenic techniques well known in the
art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a
precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by
infection with a recombinant virus. In another embodiment, the nucleic acid can be introduced by
infection with a recombinant viral vector, such as a lentiviral vector (Lois, C. et al. (2002) Science
295:868-872). The term genetic manipulation does not include classical cross-breeding, or in vitro
fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The
transgenic organisms contemplated in accordance with the present invention include bacteria,
cyanobacteria, fungi, plants and animals. The isolated DNA of the present invention can be
introduced into the host by methods known in the art, for example infection, transfection,
transformation or transconjugation. Techniques for transferring the DNA of the present invention
into such organisms are widely known and provided in references such as Sambrook and Russell
(supra).


A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence
having at least 40% sequence identity to the particular nucleic acid sequence over a certain length of
one of the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9
(May-07-1999) set at default parameters. Such a pair of nucleic acids may show, for example, at
least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least
92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least
99% or greater sequence identity over a certain defined length. A variant may be described as, for
example, an "allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice
variant may have significant identity to a reference molecule, but will generally have a greater or
lesser number of polynucleotides due to alternate splicing during mRNA processing. The
corresponding polypeptide may possess additional functional domains or lack domains that are
present in the reference molecule. Species variants are polynucleotides that vary from one species to
another. The resulting polypeptides will generally have significant amino acid identity relative to
each other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene
between individuals of a given species. Polymorphic variants also may encompass "single
nucleotide polymorphisms" (SNPs) in which the polynucleotide sequence varies by one nucleotide
base. The presence of SNPs may be indicative of, for example, a certain population, a disease state,
or a propensity for a disease state.
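
As a simple illustration of a single nucleotide polymorphism, the following Python sketch reports
the positions at which two equal-length nucleotide sequences differ by one base. The sequences are
hypothetical, the function name single_nucleotide_differences is an assumption of the sketch, and
the sequences are presumed to be already aligned without gaps; a difference found this way would
qualify as a polymorphism only if it actually occurs between individuals of a species.

```python
def single_nucleotide_differences(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i + 1, r, s) for i, (r, s) in enumerate(pairs) if r != s]


# Hypothetical reference and sample sequences differing at a single position.
print(single_nucleotide_differences("ATGGCCTA", "ATGACCTA"))
```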

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having
at least 40% sequence identity or sequence similarity to the particular polypeptide sequence over a
certain length of one of the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool
Version 2.0.9 (May-07-1999) set at default parameters. Such a pair of polypeptides may show, for
example, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least
91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%,
or at least 99% or greater sequence identity or sequence similarity over a certain defined length of
one of the polypeptides.

THE INVENTION 

Various embodiments of the invention include new human immune response associated
proteins (IRAP), the polynucleotides encoding IRAP, and the use of these compositions for the
diagnosis, treatment, or prevention of immune system, neurological, developmental, muscle, cell
proliferative disorders, and disorders of lipid metabolism.

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
embodiments of the invention. Each polynucleotide and its corresponding polypeptide are correlated
to a single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is
denoted by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an
Incyte polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide
sequence is denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ
ID NO:) and an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as
shown.

Table 2 shows sequences with homology to polypeptide embodiments of the invention as
identified by BLAST analysis against the GenBank protein (genpept) database and the PROTEOME
database. Columns 1 and 2 show the polypeptide sequence identification number (Polypeptide SEQ
ID NO:) and the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for
polypeptides of the invention. Column 3 shows the GenBank identification number (GenBank ID
NO:) of the nearest GenBank homolog and the PROTEOME database identification numbers
(PROTEOME ID NO:) of the nearest PROTEOME database homologs. Column 4 shows the
probability scores for the matches between each polypeptide and its homolog(s). Column 5 shows
the annotation of the GenBank and PROTEOME database homolog(s) along with relevant citations
where applicable, all of which are expressly incorporated by reference herein.

Table 3 shows various structural features of the polypeptides of the invention. Columns 1
and 2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding
Incyte polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention.
Column 3 shows the number of amino acid residues in each polypeptide. Column 4 shows amino
acid residues comprising signature sequences, domains, motifs, potential phosphorylation sites, and
potential glycosylation sites. Column 5 shows analytical methods for protein structure/function
analysis and in some cases, searchable databases to which the analytical methods were applied.

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and
these properties establish that the claimed polypeptides are immune response associated proteins. For
example, SEQ ID NO:4 is 99% identical, from residue M1 to residue M391, and 100% identical
from residue V392 to residue K450, to human bactericidal permeability increasing protein (BPI)
precursor (GenBank ID g179529) as determined by the Basic Local Alignment Search Tool
(BLAST). (See Table 2.) The BLAST probability score is 1.7e-236, which indicates the probability
of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:4 also has
homology to proteins that are localized to the plasma membrane, have lipopolysaccharide (LPS)-
binding function, and are bactericidal/permeability-increasing proteins, as determined by BLAST
analysis using the PROTEOME database. SEQ ID NO:4 also contains BPI/LBP/CETP domains, and
LBP/BPI/CETP family domains as determined by searching for statistically significant matches in
the hidden Markov model (HMM)-based PFAM and SMART databases of conserved protein
families/domains. (See Table 3.) Data from BLIMPS, MOTIFS, and PROFILESCAN analyses, and
BLAST analyses against the PRODOM and DOMO databases, provide further corroborative
evidence that SEQ ID NO:4 is a bactericidal permeability increasing protein.

As another example, SEQ ID NO:8 is 83% identical, from residue M1 to residue M334, to
human properdin (GenBank ID g35678) as determined by the Basic Local Alignment Search Tool
(BLAST). (See Table 2.) The BLAST probability score is 3.9e-164, which indicates the probability
of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:8 also has
homology to proteins that are extracellular, have structural functions, and are properdin factors, as
determined by BLAST analysis using the PROTEOME database. SEQ ID NO:8 also contains a
thrombospondin type 1 repeats domain as determined by searching for statistically significant
matches in the hidden Markov model (HMM)-based SMART database of conserved protein
families/domains and a thrombospondin type 1 domain as determined by searching for statistically
significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein
families/domains. (See Table 3.) Data from BLIMPS and MOTIFS analyses, and BLAST analysis
against the PRODOM and DOMO databases, provide further corroborative evidence that SEQ ID
NO:8 is a properdin.

As another example, SEQ ID NO:23 is a splice variant of human interleukin-2 (GenBank ID
g33781) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The
BLAST probability score is 9.3e-62, which indicates the probability of obtaining the observed
polypeptide sequence alignment by chance. SEQ ID NO:23 also has homology to proteins that are
interleukin-2, T-cell-derived cytokines that promote activation and proliferation of lymphocytes, are
involved in the immune response, are implicated in Sjogren's syndrome, autoimmune hemolytic
anemia, and multiple sclerosis, as determined by BLAST analysis using the PROTEOME database.
SEQ ID NO:23 also contains an interleukin-2 domain as determined by searching for statistically
significant matches in the hidden Markov model (HMM)-based PFAM and SMART databases of
conserved protein families/domains. (See Table 3.) Data from BLIMPS, MOTIFS, and
PROFILESCAN analyses, and BLAST analyses against the PRODOM and DOMO databases,
provide further corroborative evidence that SEQ ID NO:23 is an interleukin-2.

For example, SEQ ID NO:31 is a splice variant of human pentaxin (GenBank ID g35797) as
determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST
probability score is 1.5e-131, which indicates the probability of obtaining the observed polypeptide
sequence alignment by chance. SEQ ID NO:31 also has homology to proteins that play roles in
inflammation and the bacterial defense response, may limit autoimmune reactions during
inflammation, and are members of the pentaxin family of acute-phase proteins, as determined by
BLAST analysis using the PROTEOME database. SEQ ID NO:31 also contains a pentaxin family
domain as determined by searching for statistically significant matches in the hidden Markov model
(HMM)-based SMART and PFAM databases of conserved protein families/domains. (See Table 3.)
Data from BLIMPS, MOTIFS, and PROFILESCAN analyses, and BLAST analyses against the
PRODOM and DOMO databases, provide further corroborative evidence that SEQ ID NO:31 is a
pentaxin.

SEQ ID NO:1-3, SEQ ID NO:5-7, SEQ ID NO:9-22, SEQ ID NO:24-30, and SEQ ID NO:32
were analyzed and annotated in a similar manner. The algorithms and parameters for the analysis of
SEQ ID NO:1-32 are described in Table 7.

As shown in Table 4, the full length polynucleotide embodiments were assembled using
cDNA sequences or coding (exon) sequences derived from genomic DNA, or any combination of
these two types of sequences. Column 1 lists the polynucleotide sequence identification number
(Polynucleotide SEQ ID NO:), the corresponding Incyte polynucleotide consensus sequence number
(Incyte ID) for each polynucleotide of the invention, and the length of each polynucleotide sequence
in basepairs. Column 2 shows the nucleotide start (5') and stop (3') positions of the cDNA and/or
genomic sequences used to assemble the full length polynucleotide embodiments, and of fragments
of the polynucleotides which are useful, for example, in hybridization or amplification technologies
that identify SEQ ID NO:33-64 or that distinguish between SEQ ID NO:33-64 and related
polynucleotides.

The polynucleotide fragments described in Column 2 of Table 4 may refer specifically, for
example, to Incyte cDNAs derived from tissue-specific cDNA libraries or from pooled cDNA
libraries. Alternatively, the polynucleotide fragments described in column 2 may refer to GenBank
cDNAs or ESTs which contributed to the assembly of the full length polynucleotides. In addition,
the polynucleotide fragments described in column 2 may identify sequences derived from the
ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the
designation "ENST"). Alternatively, the polynucleotide fragments described in column 2 may be
derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences
including the designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those
sequences including the designation "NP"). Alternatively, the polynucleotide fragments described in
column 2 may refer to assemblages of both cDNA and Genscan-predicted exons brought together by
an "exon stitching" algorithm. For example, a polynucleotide sequence identified as
FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched" sequence in which XXXXXX is the
identification number of the cluster of sequences to which the algorithm was applied, and YYYYY is
the number of the prediction generated by the algorithm, and N1,2,3..., if present, represent specific
exons that may have been manually edited during analysis (See Example V). Alternatively, the
polynucleotide fragments in column 2 may refer to assemblages of exons brought together by an
"exon-stretching" algorithm. For example, a polynucleotide sequence identified as
FLXXXXXX_gAAAAA_gBBBBB_1_N is a "stretched" sequence, with XXXXXX being the Incyte
project identification number, gAAAAA being the GenBank identification number of the human
genomic sequence to which the "exon-stretching" algorithm was applied, gBBBBB being the
GenBank identification number or NCBI RefSeq identification number of the nearest GenBank
protein homolog, and N referring to specific exons (See Example V). In instances where a RefSeq
sequence was used as a protein homolog for the "exon-stretching" algorithm, a RefSeq identifier
(denoted by "NM," "NP," or "NT") may be used in place of the GenBank identifier (i.e., gBBBBB).

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from
genomic DNA sequences, or derived from a combination of sequence analysis methods. The
following Table lists examples of component sequence prefixes and corresponding sequence analysis
methods associated with the prefixes (see Example IV and Example V).



Prefix           Type of analysis and/or examples of programs

GNN, GFG, ENST   Exon prediction from genomic sequences using, for example,
                 GENSCAN (Stanford University, CA, USA) or FGENES
                 (Computer Genomics Group, The Sanger Centre, Cambridge, UK).

GBI              Hand-edited analysis of genomic sequences.

FL               Stitched or stretched genomic sequences (see Example V).

INCY             Full length transcript and exon prediction from mapping of EST
                 sequences to the genome. Genomic location and EST composition
                 data are combined to predict the exons and resulting transcript.
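
By way of a non-limiting illustration, the prefix convention in the preceding Table could be applied programmatically as in the Python sketch below. The lookup simply restates the Table. The function name, the matching rule (case-insensitive, longest prefix first), and the example identifiers in the usage note are assumptions made for this sketch and are not taken from the Sequence Listing.

```python
# Prefix-to-method lookup restating the Table of component sequence prefixes.
# The dictionary and function names are illustrative assumptions.
PREFIX_METHODS = {
    "GNN": "exon prediction from genomic sequences",
    "GFG": "exon prediction from genomic sequences",
    "ENST": "exon prediction from genomic sequences",
    "GBI": "hand-edited analysis of genomic sequences",
    "FL": "stitched or stretched genomic sequences",
    "INCY": "full length transcript and exon prediction from EST mapping",
}


def classify_component(identifier):
    """Return the analysis method implied by an identifier's prefix, or None."""
    upper = identifier.upper()
    # Check longer prefixes first so a short prefix never shadows a longer one.
    for prefix in sorted(PREFIX_METHODS, key=len, reverse=True):
        if upper.startswith(prefix):
            return PREFIX_METHODS[prefix]
    return None
```

For instance, a hypothetical identifier beginning with "GBI" would be reported as a hand-edited genomic sequence, while an identifier with no listed prefix returns None.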



In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in Table
4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte
cDNA identification numbers are not shown.

Table 5 shows the representative cDNA libraries for those full length polynucleotides which
were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte cDNA
library which is most frequently represented by the Incyte cDNA sequences which were used to
assemble and confirm the above polynucleotides. The tissues and vectors which were used to
construct the cDNA libraries shown in Table 5 are described in Table 6.

Table 8 shows single nucleotide polymorphisms (SNPs) found in polynucleotide sequences
of the invention, along with allele frequencies in different human populations. Columns 1 and 2
show the polynucleotide sequence identification number (SEQ ID NO:) and the corresponding Incyte
project identification number (PID) for polynucleotides of the invention. Column 3 shows the Incyte
identification number for the EST in which the SNP was detected (EST ID), and column 4 shows the
identification number for the SNP (SNP ID). Column 5 shows the position within the EST sequence
at which the SNP is located (EST SNP), and column 6 shows the position of the SNP within the
full-length polynucleotide sequence (CB1 SNP). Column 7 shows the allele found in the EST sequence.
Columns 8 and 9 show the two alleles found at the SNP site. Column 10 shows the amino acid
encoded by the codon including the SNP site, based upon the allele found in the EST. Columns
11-14 show the frequency of allele 1 in four different human populations. An entry of n/d (not
detected) indicates that the frequency of allele 1 in the population was too low to be detected, while
n/a (not available) indicates that the allele frequency was not determined for the population.
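
By way of a non-limiting illustration, the n/d and n/a conventions described for the allele frequency columns might be handled as in the following Python sketch. The function name and the returned status labels are assumptions made for this example and do not appear in Table 8.

```python
def parse_allele_frequency(entry):
    """Interpret one allele 1 frequency entry from the population columns.

    Returns a (status, value) pair; the status labels are illustrative only.
    """
    text = entry.strip().lower()
    if text == "n/d":
        # Not detected: the frequency was too low to be detected.
        return ("not detected", None)
    if text == "n/a":
        # Not available: the frequency was not determined for the population.
        return ("not available", None)
    # Any other entry is assumed to be a numeric frequency.
    return ("measured", float(text))
```

Keeping the two non-numeric cases distinct preserves the difference between an allele that was assayed but too rare to detect and a population that was not assayed.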

The invention also encompasses IRAP variants. Various embodiments of IRAP variants can
have at least about 80%, at least about 90%, or at least about 95% amino acid sequence identity to
the IRAP amino acid sequence, and can contain at least one functional or structural characteristic of
IRAP.
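
By way of a non-limiting illustration, a percent identity threshold of this kind could be checked against two already-aligned sequences as in the Python sketch below. The sketch counts matching columns over the alignment length; this simple definition, the function name, and the use of "-" as the gap character are assumptions of the example, and the algorithms actually used to determine sequence identity are those described elsewhere in the specification.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over two equal-length aligned sequences ('-' is a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column matches only when both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

Under this definition, a ten-residue alignment with a single substitution yields 90% identity and would satisfy the "at least about 80%" and "at least about 90%" thresholds.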

Various embodiments also encompass polynucleotides which encode IRAP. In a particular
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected
from the group consisting of SEQ ID NO:33-64, which encodes IRAP. The polynucleotide
sequences of SEQ ID NO:33-64, as presented in the Sequence Listing, embrace the equivalent RNA
sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the
sugar backbone is composed of ribose instead of deoxyribose.
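
By way of a non-limiting illustration, the base substitution that relates a DNA sequence to its equivalent RNA sequence can be written as the short Python sketch below. The function name is an assumption of the example, and the change of sugar backbone from deoxyribose to ribose is chemical and is not represented in the sequence text.

```python
def dna_to_rna(dna):
    """Return the equivalent RNA sequence: every thymine (T) becomes uracil (U)."""
    return dna.upper().replace("T", "U")
```

For example, the DNA sequence ATGCTT corresponds to the RNA sequence AUGCUU.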

The invention also encompasses variants of a polynucleotide encoding IRAP. In particular,
such a variant polynucleotide will have at least about 70%, or alternatively at least about 85%, or
even at least about 95% polynucleotide sequence identity to a polynucleotide encoding IRAP. A
particular aspect of the invention encompasses a variant of a polynucleotide comprising a sequence
selected from the group consisting of SEQ ID NO:33-64 which has at least about 70%, or
alternatively at least about 85%, or even at least about 95% polynucleotide sequence identity to a
nucleic acid sequence selected from the group consisting of SEQ ID NO:33-64. Any one of the
polynucleotide variants described above can encode a polypeptide which contains at least one
functional or structural characteristic of IRAP.

In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant
of a polynucleotide encoding IRAP. A splice variant may have portions which have significant
sequence identity to a polynucleotide encoding IRAP, but will generally have a greater or lesser
number of nucleotides due to additions or deletions of blocks of sequence arising from alternate
splicing during mRNA processing. A splice variant may have less than about 70%, or alternatively
less than about 60%, or alternatively less than about 50% polynucleotide sequence identity to a
polynucleotide encoding IRAP over its entire length; however, portions of the splice variant will
have at least about 70%, or alternatively at least about 85%, or alternatively at least about 95%, or
alternatively 100% polynucleotide sequence identity to portions of the polynucleotide encoding
IRAP. For example, a polynucleotide comprising a sequence of SEQ ID NO:34 and a polynucleotide
comprising a sequence of SEQ ID NO:36 are splice variants of each other; a polynucleotide
comprising a sequence of SEQ ID NO:35 and a polynucleotide comprising a sequence of SEQ ID
NO:37 are splice variants of each other; a polynucleotide comprising a sequence of SEQ ID NO:39
and a polynucleotide comprising a sequence of SEQ ID NO:40 are splice variants of each other; and
a polynucleotide comprising a sequence of SEQ ID NO:33 and a polynucleotide comprising a
sequence of SEQ ID NO:47 are splice variants of each other. Any one of the splice variants
described above can encode a polypeptide which contains at least one functional or structural
characteristic of IRAP.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of polynucleotide sequences encoding IRAP, some bearing minimal
similarity to the polynucleotide sequences of any known and naturally occurring gene, may be
produced. Thus, the invention contemplates each and every possible variation of polynucleotide
sequence that could be made by selecting combinations based on possible codon choices. These
combinations are made in accordance with the standard triplet genetic code as applied to the
polynucleotide sequence of naturally occurring IRAP, and all such variations are to be considered as
being specifically disclosed.
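
By way of a non-limiting illustration, the size of the "multitude" that arises from codon degeneracy can be estimated with the Python sketch below, which multiplies the number of synonymous codons available for each residue under the standard genetic code. The function and variable names and the short peptides in the usage note are assumptions of the example and are not IRAP sequences.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons for each amino acid ('*' marks a stop codon).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide):
    """Count the distinct coding sequences that encode the given peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)
```

A dipeptide of methionine and tryptophan has a single possible coding sequence, whereas a tripeptide of leucine, serine, and arginine already has 6 x 6 x 6 = 216, so the count for a full-length protein grows very rapidly.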

Although polynucleotides which encode IRAP and its variants are generally capable of
hybridizing to polynucleotides encoding naturally occurring IRAP under appropriately selected
conditions of stringency, it may be advantageous to produce polynucleotides encoding IRAP or its
derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally
occurring codons. Codons may be selected to increase the rate at which expression of the peptide
occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which
particular codons are utilized by the host. Other reasons for substantially altering the nucleotide
sequence encoding IRAP and its derivatives without altering the encoded amino acid sequences
include the production of RNA transcripts having more desirable properties, such as a greater
half-life, than transcripts produced from the naturally occurring sequence.
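
By way of a non-limiting illustration, selecting codons according to host usage frequency could be sketched in Python as follows. The usage values shown are placeholders invented for the example and are not measured codon frequencies for any organism; the function and variable names are likewise assumptions.

```python
# Hypothetical relative codon frequencies for an unnamed host. The numbers are
# placeholders for illustration, not measured codon usage data.
EXAMPLE_HOST_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.70, "AAG": 0.30},
    "L": {"CTG": 0.50, "CTC": 0.20, "TTA": 0.30},
}


def recode_for_host(peptide, usage):
    """Encode a peptide using, for each residue, the host's most frequent codon."""
    return "".join(
        max(usage[residue], key=usage[residue].get) for residue in peptide
    )
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is unchanged. Always choosing the single most frequent codon is only one possible strategy, and a practical design might instead sample codons in proportion to their usage.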

The invention also encompasses production of polynucleotides which encode IRAP and
IRAP derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the
synthetic polynucleotide may be inserted into any of the many available expression vectors and cell
systems using reagents well known in the art. Moreover, synthetic chemistry may be used to
introduce mutations into a polynucleotide encoding IRAP or any fragment thereof.

Embodiments of the invention can also include polynucleotides that are capable of
hybridizing to the claimed polynucleotides, and, in particular, to those having the sequences shown
in SEQ ID NO:33-64 and fragments thereof, under various conditions of stringency (Wahl, G.M. and
S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol.
152:507-511). Hybridization conditions, including annealing and wash conditions, are described in
"Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied
Biosystems), thermostable T7 polymerase (Amersham Biosciences, Piscataway NJ), or combinations
of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification
system (Invitrogen, Carlsbad CA). Preferably, sequence preparation is automated with machines
such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV), PTC200 thermal cycler
(MJ Research, Watertown MA) and ABI CATALYST 800 thermal cycler (Applied Biosystems).
Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system (Applied
Biosystems), the MEGABACE 1000 DNA sequencing system (Amersham Biosciences), or other
systems known in the art. The resulting sequences are analyzed using a variety of algorithms which
are well known in the art (Ausubel et al., supra, ch. 7; Meyers, R.A. (1995) Molecular Biology and
Biotechnology, Wiley VCH, New York NY, pp. 856-853).

The nucleic acids encoding IRAP may be extended utilizing a partial nucleotide sequence
and employing various PCR-based methods known in the art to detect upstream sequences, such as
promoters and regulatory elements. For example, one method which may be employed,
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic
DNA within a cloning vector (Sarkar, G. (1993) PCR Methods Applic. 2:318-322). Another method,
inverse PCR, uses primers that extend in divergent directions to amplify unknown sequence from a
circularized template. The template is derived from restriction fragments comprising a known
genomic locus and surrounding sequences (Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). A
third method, capture PCR, involves PCR amplification of DNA fragments adjacent to known
sequences in human and yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR
Methods Applic. 1:111-119). In this method, multiple restriction enzyme digestions and ligations
may be used to insert an engineered double-stranded sequence into a region of unknown sequence
before performing PCR. Other methods which may be used to retrieve unknown sequences are
known in the art (Parker, J.D. et al. (1991) Nucleic Acids Res. 19:3055-3060). Additionally, one
may use PCR, nested primers, and PROMOTERFINDER libraries (BD Clontech, Palo Alto CA) to
walk genomic DNA. This procedure avoids the need to screen libraries and is useful in finding
intron/exon junctions. For all PCR-based methods, primers may be designed using commercially
available software, such as OLIGO 4.06 primer analysis software (National Biosciences, Plymouth
MN) or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC
content of about 50% or more, and to anneal to the template at temperatures of about 68°C to 72°C.
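
By way of a non-limiting illustration, the length and GC content criteria stated above could be screened with the Python sketch below. The sketch is not a reproduction of the OLIGO software or of any commercial program; the function names and default arguments are assumptions of the example, and the annealing temperature criterion is deliberately not modeled because it depends on the melting temperature method chosen.

```python
def gc_fraction(sequence):
    """Fraction of bases in a non-empty sequence that are G or C."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def meets_primer_criteria(primer, min_len=22, max_len=30, min_gc=0.50):
    """Check the stated length (22 to 30 nt) and GC content (50% or more)."""
    # The length test runs first, so gc_fraction never sees an empty primer.
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc
```

A candidate that passes this screen would still need its annealing behavior evaluated, for example with primer analysis software of the kind named above.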

When screening for full length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T)
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of
sequence into 5' non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to analyze
the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary
sequencing may employ flowable polymers for electrophoretic separation, four different
nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for
detection of the emitted wavelengths. Output/light intensity may be converted to electrical signal
using appropriate software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems),
and the entire process from loading of samples to computer analysis and electronic data display may
be computer controlled. Capillary electrophoresis is especially preferable for sequencing small DNA
fragments which may be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotides or fragments thereof which encode
IRAP may be cloned in recombinant DNA molecules that direct expression of IRAP, or fragments or
functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of the
genetic code, other polynucleotides which encode substantially the same or a functionally equivalent
polypeptide may be produced and used to express IRAP.

The polynucleotides of the invention can be engineered using methods generally known in
the art in order to alter IRAP-encoding sequences for a variety of purposes including, but not limited
to, modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by
random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be
used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed
mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation
patterns, change codon preference, produce splice variants, and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent No.
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or
improve the biological properties of IRAP, such as its biological or enzymatic activity or its ability
to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene
variants is produced using PCR-mediated recombination of gene fragments. The library is then
subjected to selection or screening procedures that identify those gene variants with the desired
properties. These preferred variants may then be pooled and further subjected to recursive rounds of
DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial"
breeding and rapid molecular evolution. For example, fragments of a single gene containing random
point mutations may be recombined, screened, and then reshuffled until the desired properties are
optimized. Alternatively, fragments of a given gene may be recombined with fragments of
homologous genes in the same gene family, either from the same or different species, thereby
maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable
manner.

In another embodiment, polynucleotides encoding IRAP may be synthesized, in whole or in
part, using one or more chemical methods well known in the art (Caruthers, M.H. et al. (1980)
Nucleic Acids Symp. Ser. 7:215-223; Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232).
Alternatively, IRAP itself or a fragment thereof may be synthesized using chemical methods known
in the art. For example, peptide synthesis can be performed using various solution-phase or
solid-phase techniques (Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH
Freeman, New York NY, pp. 55-60; Roberge, J.Y. et al. (1995) Science 269:202-204). Automated
synthesis may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems).
Additionally, the amino acid sequence of IRAP, or any part thereof, may be altered during direct
synthesis and/or combined with sequences from other proteins, or any part thereof, to produce a
variant polypeptide or a polypeptide having a sequence of a naturally occurring polypeptide.

The peptide may be substantially purified by preparative high performance liquid
chromatography (Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421). The
composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing
(Creighton, supra, pp. 28-53).

In order to express a biologically active IRAP, the polynucleotides encoding IRAP or
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which
contains the necessary elements for transcriptional and translational control of the inserted coding
sequence in a suitable host. These elements include regulatory sequences, such as enhancers,
constitutive and inducible promoters, and 5' and 3' untranslated regions in the vector and in
polynucleotides encoding IRAP. Such elements may vary in their strength and specificity. Specific
initiation signals may also be used to achieve more efficient translation of polynucleotides encoding
IRAP. Such signals include the ATG initiation codon and adjacent sequences, e.g., the Kozak
sequence. In cases where a polynucleotide sequence encoding IRAP and its initiation codon and
upstream regulatory sequences are inserted into the appropriate expression vector, no additional
transcriptional or translational control signals may be needed. However, in cases where only coding
sequence, or a fragment thereof, is inserted, exogenous translational control signals including an
in-frame ATG initiation codon should be provided by the vector. Exogenous translational elements and
initiation codons may be of various origins, both natural and synthetic. The efficiency of expression
may be enhanced by the inclusion of enhancers appropriate for the particular host cell system used
(Scharf, D. et al. (1994) Results Probl. Cell Differ. 20:125-162).
Methods which are well known to those skilled in the art may be used to construct
expression vectors containing polynucleotides encoding IRAP and appropriate transcriptional and
translational control elements. These methods include in vitro recombinant DNA techniques,
synthetic techniques, and in vivo genetic recombination (Sambrook and Russell, supra, ch. 1-4, and
8; Ausubel et al., supra, ch. 1, 3, and 15).

A variety of expression vector/host systems may be utilized to contain and express
polynucleotides encoding IRAP. These include, but are not limited to, microorganisms such as
bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors;
yeast transformed with yeast expression vectors; insect cell systems infected with viral expression
vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g.,
cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression
vectors (e.g., Ti or pBR322 plasmids); or animal cell systems (Sambrook and Russell, supra;
Ausubel et al., supra; Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509;
Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996)
Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill
Yearbook of Science and Technology (1992) McGraw Hill, New York NY, pp. 191-196; Logan, J.
and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; Harrington, J.J. et al. (1997) Nat.
Genet. 15:345-355). Expression vectors derived from retroviruses, adenoviruses, or herpes or
vaccinia viruses, or from various bacterial plasmids, may be used for delivery of polynucleotides to
the targeted organ, tissue, or cell population (Di Nicola, M. et al. (1998) Cancer Gen. Ther.
5:350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Buller, R.M. et al. (1985)
Nature 317:813-815; McGregor, D.P. et al. (1994) Mol. Immunol. 31:219-226; Verma, I.M. and N.
Somia (1997) Nature 389:239-242). The invention is not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending
upon the use intended for polynucleotides encoding IRAP. For example, routine cloning,
subcloning, and propagation of polynucleotides encoding IRAP can be achieved using a
multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1
plasmid (Invitrogen). Ligation of polynucleotides encoding IRAP into the vector's multiple cloning
site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of
transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for
in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of
nested deletions in the cloned sequence (Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem.
264:5503-5509). When large quantities of IRAP are needed, e.g., for the production of antibodies,
vectors which direct high level expression of IRAP may be used. For example, vectors containing
the strong, inducible SP6 or T7 bacteriophage promoter may be used.

Yeast expression systems may be used for production of IRAP. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable
integration of foreign polynucleotide sequences into the host genome for stable propagation
(Ausubel et al., supra; Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; Scorer, C.A. et al.
(1994) Bio/Technology 12:181-184).

Plant systems may also be used for expression of IRAP. Transcription of polynucleotides
encoding IRAP may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used
alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO
J. 6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used (Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984)
Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105). These
constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated
transfection (The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196).

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, polynucleotides encoding IRAP may be ligated
into an adenovirus transcription/translation complex consisting of the late promoter and tripartite
leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to
obtain infective virus which expresses IRAP in host cells (Logan, J. and T. Shenk (1984) Proc. Natl.
Acad. Sci. USA 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus
(RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based
vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino
polymers, or vesicles) for therapeutic purposes (Harrington, J.J. et al. (1997) Nat. Genet.
15:345-355).

For long term production of recombinant proteins in mammalian systems, stable expression
of IRAP in cell lines is preferred. For example, polynucleotides encoding IRAP can be transformed
into cell lines using expression vectors which may contain viral origins of replication and/or
endogenous expression elements and a selectable marker gene on the same or on a separate vector.
Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days in
enriched media before being switched to selective media. The purpose of the selectable marker is to
confer resistance to a selective agent, and its presence allows growth and recovery of cells which
successfully express the introduced sequences. Resistant clones of stably transformed cells may be
propagated using tissue culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine
phosphoribosyltransferase genes, for use in tk- and apr- cells, respectively (Wigler, M. et al. (1977)
Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823). Also, antimetabolite, antibiotic, or
herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat
confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively (Wigler, M. et
al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol.
150:1-14). Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular
requirements for metabolites (Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA
85:8047-8051). Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; BD Clontech),
β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used.
These markers can be used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system (Rhodes, C.A. (1995)
Methods Mol. Biol. 55:121-131).

Although the presence/absence of marker gene expression suggests that the gene of interest
is also present, the presence and expression of the gene may need to be confirmed. For example, if
the sequence encoding IRAP is inserted within a marker gene sequence, transformed cells containing
polynucleotides encoding IRAP can be identified by the absence of marker gene function.
Alternatively, a marker gene can be placed in tandem with a sequence encoding IRAP under the
control of a single promoter. Expression of the marker gene in response to induction or selection
usually indicates expression of the tandem gene as well.

In general, host cells that contain the polynucleotide encoding IRAP and that express IRAP
may be identified by a variety of procedures known to those of skill in the art. These procedures
include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and
protein bioassay or immunoassay techniques which include membrane, solution, or chip based
technologies for the detection and/or quantification of nucleic acid or protein sequences.

Immunological methods for detecting and measuring the expression of IRAP using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on IRAP is preferred, but a
competitive binding assay may be employed. These and other assays are well known in the art
(Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN,
Sect. IV; Coligan, J.E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and
Wiley-Interscience, New York NY; Pound, J.D. (1998) Immunochemical Protocols, Humana Press,
Totowa NJ).

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled
hybridization or PCR probes for detecting sequences related to polynucleotides encoding IRAP
include oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled
nucleotide. Alternatively, polynucleotides encoding IRAP, or any fragments thereof, may be cloned
into a vector for the production of an mRNA probe. Such vectors are known in the art, are
commercially available, and may be used to synthesize RNA probes in vitro by addition of an
appropriate RNA polymerase such as T7, T3, or SP6 and labeled nucleotides. These procedures may
be conducted using a variety of commercially available kits, such as those provided by Amersham
Biosciences, Promega (Madison WI), and US Biochemical. Suitable reporter molecules or labels
which may be used for ease of detection include radionuclides, enzymes, fluorescent,
chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic
particles, and the like.

Host cells transformed with polynucleotides encoding IRAP may be cultured under
conditions suitable for the expression and recovery of the protein from cell culture. The protein
produced by a transformed cell may be secreted or retained intracellularly depending on the
sequence and/or the vector used. As will be understood by those of skill in the art, expression
vectors containing polynucleotides which encode IRAP may be designed to contain signal sequences
which direct secretion of IRAP through a prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted polynucleotides or to process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited to, acetylation, carboxylation,
glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which
cleaves a "prepro" or "pro" form of the protein may also be used to specify protein targeting, folding,
and/or activity. Different host cells which have specific cellular machinery and characteristic
mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are
available from the American Type Culture Collection (ATCC, Manassas VA) and may be chosen to
ensure the correct modification and processing of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant polynucleotides
encoding IRAP may be ligated to a heterologous sequence resulting in translation of a fusion protein
in any of the aforementioned host systems. For example, a chimeric IRAP protein containing a
heterologous moiety that can be recognized by a commercially available antibody may facilitate the
screening of peptide libraries for inhibitors of IRAP activity. Heterologous protein and peptide
moieties may also facilitate purification of fusion proteins using commercially available affinity
matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose
binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc,
and hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate
fusion proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and
metal-chelate resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity
purification of fusion proteins using commercially available monoclonal and polyclonal antibodies
that specifically recognize these epitope tags. A fusion protein may also be engineered to contain a
proteolytic cleavage site located between the IRAP encoding sequence and the heterologous protein
sequence, so that IRAP may be cleaved away from the heterologous moiety following purification.
Methods for fusion protein expression and purification are discussed in Ausubel et al. (supra, ch. 10
and 16). A variety of commercially available kits may also be used to facilitate expression and
purification of fusion proteins.

In another embodiment, synthesis of radiolabeled IRAP may be achieved in vitro using the
TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple
transcription and translation of protein-coding sequences operably associated with the T7, T3, or
SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for
example, 35S-methionine.

IRAP, fragments of IRAP, or variants of IRAP may be used to screen for compounds that
specifically bind to IRAP. One or more test compounds may be screened for specific binding to
IRAP. In various embodiments, 1, 2, 3, 4, 5, 10, 20, 50, 100, or 200 test compounds can be screened
for specific binding to IRAP. Examples of test compounds can include antibodies, anticalins,
oligonucleotides, proteins (e.g., ligands or receptors), or small molecules.

In related embodiments, variants of IRAP can be used to screen for binding of test
compounds, such as antibodies, to IRAP, a variant of IRAP, or a combination of IRAP and/or one or
more variants of IRAP. In an embodiment, a variant of IRAP can be used to screen for compounds that
bind to a variant of IRAP, but not to IRAP having the exact sequence of a sequence of SEQ ID
NO:1-32. IRAP variants used to perform such screening can have a range of about 50% to about
99% sequence identity to IRAP, with various embodiments having 60%, 70%, 75%, 80%, 85%,
90%, and 95% sequence identity.
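
As an illustration only, the percent sequence identity between IRAP and a candidate variant can be sketched as below. The sketch assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts identical residues over aligned columns; this is one common convention among several, and the sequences and function names are hypothetical, not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap.

    Assumption: the inputs are pre-aligned and of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def in_variant_range(identity: float, low: float = 50.0, high: float = 99.0) -> bool:
    """True if the identity falls in the about 50% to about 99% range."""
    return low <= identity <= high


# Hypothetical 10-residue fragments differing at one position.
reference = "MKTLLVAGIL"
variant = "MKTLLVAGIV"
identity = percent_identity(reference, variant)
print(identity, in_variant_range(identity))
```

A sequence identical to the reference scores 100% and therefore falls outside the variant range, which matches the distinction drawn above between variants and IRAP having the exact sequence.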

In an embodiment, a compound identified in a screen for specific binding to IRAP can be
closely related to the natural ligand of IRAP, e.g., a ligand or fragment thereof, a natural substrate, a
structural or functional mimetic, or a natural binding partner (Coligan, J.E. et al. (1991) Current
Protocols in Immunology 1(2):Chapter 5). In another embodiment, the compound thus identified
can be a natural ligand of a receptor IRAP (Howard, A.D. et al. (2001) Trends Pharmacol.
Sci. 22:132-140; Wise, A. et al. (2002) Drug Discovery Today 7:235-246).

In other embodiments, a compound identified in a screen for specific binding to IRAP can
be closely related to the natural receptor to which IRAP binds, at least a fragment of the receptor, or
a fragment of the receptor including all or a portion of the ligand binding site or binding pocket. For
example, the compound may be a receptor for IRAP which is capable of propagating a signal, or a
decoy receptor for IRAP which is not capable of propagating a signal (Ashkenazi, A. and V.M. Dixit
(1999) Curr. Opin. Cell Biol. 11:255-260; Mantovani, A. et al. (2001) Trends Immunol. 22:328-336).
The compound can be rationally designed using known techniques. Examples of such
techniques include those used to construct the compound etanercept (ENBREL; Amgen Inc.,
Thousand Oaks CA), which is efficacious for treating rheumatoid arthritis in humans. Etanercept is
an engineered p75 tumor necrosis factor (TNF) receptor dimer linked to the Fc portion of human
IgG1 (Taylor, P.C. et al. (2001) Curr. Opin. Immunol. 13:611-616).

In one embodiment, two or more antibodies having similar or, alternatively, different
specificities can be screened for specific binding to IRAP, fragments of IRAP, or variants of IRAP.
The binding specificity of the antibodies thus screened can thereby be selected to identify particular
fragments or variants of IRAP. In one embodiment, an antibody can be selected such that its binding
specificity allows for preferential identification of specific fragments or variants of IRAP. In
another embodiment, an antibody can be selected such that its binding specificity allows for
preferential diagnosis of a specific disease or condition having increased, decreased, or otherwise
abnormal production of IRAP.

In an embodiment, anticalins can be screened for specific binding to IRAP, fragments of
IRAP, or variants of IRAP. Anticalins are ligand-binding proteins that have been constructed based
on a lipocalin scaffold (Weiss, G.A. and H.B. Lowman (2000) Chem. Biol. 7:R177-R184; Skerra, A.
(2001) J. Biotechnol. 74:257-275). The protein architecture of lipocalins can include a beta-barrel
having eight antiparallel beta-strands, which supports four loops at its open end. These loops form
the natural ligand-binding site of the lipocalins, a site which can be re-engineered in vitro by amino
acid substitutions to impart novel binding specificities. The amino acid substitutions can be made
using methods known in the art or described herein, and can include conservative substitutions (e.g.,
substitutions that do not alter binding specificity) or substitutions that modestly, moderately, or
significantly alter binding specificity.

In one embodiment, screening for compounds which specifically bind to, stimulate, or
inhibit IRAP involves producing appropriate cells which express IRAP, either as a secreted protein
or on the cell membrane. Preferred cells can include cells from mammals, yeast, Drosophila, or E.
coli. Cells expressing IRAP or cell membrane fractions which contain IRAP are then contacted with
a test compound and binding, stimulation, or inhibition of activity of either IRAP or the compound is
analyzed.

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example,
the assay may comprise the steps of combining at least one test compound with IRAP, either in
solution or affixed to a solid support, and detecting the binding of IRAP to the compound.
Alternatively, the assay may detect or measure binding of a test compound in the presence of a
labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical
libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to
a solid support.

An assay can be used to assess the ability of a compound to bind to its natural ligand and/or
to inhibit the binding of its natural ligand to its natural receptors. Examples of such assays include
radio-labeling assays such as those described in U.S. Patent No. 5,914,236 and U.S. Patent No.
6,372,724. In a related embodiment, one or more amino acid substitutions can be introduced into a
polypeptide compound (such as a receptor) to improve or alter its ability to bind to its natural ligands
(Matthews, D.J. and J.A. Wells (1994) Chem. Biol. 1:25-30). In another related embodiment, one
or more amino acid substitutions can be introduced into a polypeptide compound (such as a ligand)
to improve or alter its ability to bind to its natural receptors (Cunningham, B.C. and J.A. Wells
(1991) Proc. Natl. Acad. Sci. USA 88:3407-3411; Lowman, H.B. et al. (1991) J. Biol. Chem.
266:10982-10988).

IRAP, fragments of IRAP, or variants of IRAP may be used to screen for compounds that
modulate the activity of IRAP. Such compounds may include agonists, antagonists, or partial or
inverse agonists. In one embodiment, an assay is performed under conditions permissive for IRAP
activity, wherein IRAP is combined with at least one test compound, and the activity of IRAP in the
presence of a test compound is compared with the activity of IRAP in the absence of the test
compound. A change in the activity of IRAP in the presence of the test compound is indicative of a
compound that modulates the activity of IRAP. Alternatively, a test compound is combined with an
in vitro or cell-free system comprising IRAP under conditions suitable for IRAP activity, and the
assay is performed. In either of these assays, a test compound which modulates the activity of IRAP
may do so indirectly and need not come in direct contact with the test compound. At least one and
up to a plurality of test compounds may be screened.

In another embodiment, polynucleotides encoding IRAP or their mammalian homologs may
be "knocked out" in an animal model system using homologous recombination in embryonic stem
(ES) cells. Such techniques are well known in the art and are useful for the generation of animal
models of human disease (see, e.g., U.S. Patent No. 5,175,383 and U.S. Patent No. 5,767,337). For
example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse
embryo and grown in culture. The ES cells are transformed with a vector containing the gene of
interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi,
M.R. (1989) Science 244:1288-1292). The vector integrates into the corresponding region of the
host genome by homologous recombination. Alternatively, homologous recombination takes place
using the Cre-loxP system to knockout a gene of interest in a tissue- or developmental stage-specific
manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids
Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell
blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred
to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce
heterozygous or homozygous strains. Transgenic animals thus generated may be tested with
potential therapeutic or toxic agents.

Polynucleotides encoding IRAP may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147).

Polynucleotides encoding IRAP can also be used to create "knockin" humanized animals
(pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a
region of a polynucleotide encoding IRAP is injected into animal ES cells, and the injected sequence
integrates into the animal cell genome. Transformed cells are injected into blastulae, and the
blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and
treated with potential pharmaceutical agents to obtain information on treatment of a human disease.
Alternatively, a mammal inbred to overexpress IRAP, e.g., by secreting IRAP in its milk, may also
serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
THERAPEUTICS 

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists
between regions of IRAP and immune response associated proteins. In addition, examples of tissues
expressing IRAP can be found in Table 6 and can also be found in Example XI. Therefore, IRAP
appears to play a role in immune system, neurological, developmental, muscle, cell proliferative
disorders, and disorders of lipid metabolism. In the treatment of disorders associated with increased
IRAP expression or activity, it is desirable to decrease the expression or activity of IRAP. In the
treatment of disorders associated with decreased IRAP expression or activity, it is desirable to
increase the expression or activity of IRAP.

Therefore, in one embodiment, IRAP or a fragment or derivative thereof may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of IRAP. Examples of such disorders include, but are not limited to, an immune system
disorder such as acquired immunodeficiency syndrome (AIDS), X-linked agammaglobinemia of
Bruton, common variable immunodeficiency (CVI), DiGeorge's syndrome (thymic hypoplasia),
thymic dysplasia, isolated IgA deficiency, severe combined immunodeficiency disease (SCID),
immunodeficiency with thrombocytopenia and eczema (Wiskott-Aldrich syndrome),
Chediak-Higashi syndrome, chronic granulomatous diseases, hereditary angioneurotic edema,
immunodeficiency associated with Cushing's disease, Addison's disease, adult respiratory distress
syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis,
autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema,
episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic
gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis,
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's
syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis,
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral,
bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a neurological disorder
such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease,
Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal
disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural
muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating
diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess,
suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system
disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases
of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis,
encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central
nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic
nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and
other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and
polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic
paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective
disorder (SAD), akathesia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias,
paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy,
corticobasal degeneration, and familial frontotemporal dementia; a developmental disorder such as
renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and
Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor,
aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome,
myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary
neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism,
hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida,
anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss; a
muscle disorder such as cardiomyopathy, myocarditis, Duchenne's muscular dystrophy, Becker's
muscular dystrophy, myotonic dystrophy, central core disease, nemaline myopathy, centronuclear
myopathy, lipid myopathy, mitochondrial myopathy, infectious myositis, polymyositis,
dermatomyositis, inclusion body myositis, thyrotoxic myopathy, and ethanol myopathy; and a cell
proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis,
hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal
hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia; cancers including
adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in
particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, colon, gall
bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas,
parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; and a
disorder of lipid metabolism such as fatty liver, cholestasis, primary biliary cirrhosis, carnitine
deficiency, carnitine palmitoyltransferase deficiency, myoadenylate deaminase deficiency,
hypertriglyceridemia, lipid storage disorders such as Fabry's disease, Gaucher's disease,
Niemann-Pick's disease, metachromatic leukodystrophy, adrenoleukodystrophy, GM2 gangliosidosis,
and ceroid lipofuscinosis, abetalipoproteinemia, Tangier disease, hyperlipoproteinemia, diabetes
mellitus, lipodystrophy, lipomatoses, acute panniculitis, disseminated fat necrosis, adiposis dolorosa,
lipoid adrenal hyperplasia, minimal change disease, lipomas, atherosclerosis, hypercholesterolemia,
hypercholesterolemia with hypertriglyceridemia, primary hypoalphalipoproteinemia,
hypothyroidism, renal disease, liver disease, lecithin:cholesterol acyltransferase deficiency,
cerebrotendinous xanthomatosis, sitosterolemia, hypocholesterolemia, Tay-Sachs disease,
Sandhoff's disease, hyperlipidemia, hyperlipemia, lipid myopathies, and obesity.

In another embodiment, a vector capable of expressing IRAP or a fragment or derivative
thereof may be administered to a subject to treat or prevent a disorder associated with decreased
expression or activity of IRAP including, but not limited to, those described above.

In a further embodiment, a composition comprising a substantially purified IRAP in
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or
prevent a disorder associated with decreased expression or activity of IRAP including, but not
limited to, those provided above.

In still another embodiment, an agonist which modulates the activity of IRAP may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of IRAP including, but not limited to, those listed above.

In a further embodiment, an antagonist of IRAP may be administered to a subject to treat or
prevent a disorder associated with increased expression or activity of IRAP. Examples of such
disorders include, but are not limited to, those immune system, neurological, developmental,
muscle, cell proliferative disorders, and disorders of lipid metabolism described above. In one
aspect, an antibody which specifically binds IRAP may be used directly as an antagonist or indirectly
as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissues which
express IRAP.

In an additional embodiment, a vector expressing the complement of the polynucleotide
encoding IRAP may be administered to a subject to treat or prevent a disorder associated with
increased expression or activity of IRAP including, but not limited to, those described above.

In other embodiments, any protein, agonist, antagonist, antibody, complementary sequence,
or vector embodiments may be administered in combination with other appropriate therapeutic
agents. Selection of the appropriate agents for use in combination therapy may be made by one of
ordinary skill in the art, according to conventional pharmaceutical principles. The combination of
therapeutic agents may act synergistically to effect the treatment or prevention of the various
disorders described above. Using this approach, one may be able to achieve therapeutic efficacy
with lower dosages of each agent, thus reducing the potential for adverse side effects.

An antagonist of IRAP may be produced using methods which are generally known in the
art. In particular, purified IRAP may be used to produce antibodies or to screen libraries of
pharmaceutical agents to identify those which specifically bind IRAP. Antibodies to IRAP may also
be generated using methods that are well known in the art. Such antibodies may include, but are not
limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and
fragments produced by a Fab expression library. In an embodiment, neutralizing antibodies (i.e.,
those which inhibit dimer formation) can be used therapeutically. Single chain antibodies (e.g., from
camels or llamas) may be potent enzyme inhibitors and may have application in the design of peptide
mimetics, and in the development of immuno-adsorbents and biosensors (Muyldermans, S. (2001) J.
Biotechnol. 74:277-302).

For the production of antibodies, various hosts including goats, rabbits, rats, mice, camels,
dromedaries, llamas, humans, and others may be immunized by injection with IRAP or with any
fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species,
various adjuvants may be used to increase immunological response. Such adjuvants include, but are
not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances
such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol.
Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are
especially preferable.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to
IRAP have an amino acid sequence consisting of at least about 5 amino acids, and generally will
consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or
fragments are substantially identical to a portion of the amino acid sequence of the natural protein.
Short stretches of IRAP amino acids may be fused with those of another protein, such as KLH, and
antibodies to the chimeric molecule may be produced.
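
The length guidance above can be illustrated with a short sketch that enumerates every contiguous peptide of a chosen length from a protein sequence, each identical to a portion of the parent sequence. The sequence shown is a made-up placeholder rather than any SEQ ID NO, the function name is hypothetical, and the sketch makes no attempt to predict immunogenicity.

```python
def candidate_peptides(protein: str, length: int = 10) -> list[tuple[int, str]]:
    """Return (1-based start position, peptide) for every window of `length` residues.

    Assumption: windows shorter than 5 residues are rejected, following the
    "at least about 5 amino acids" guidance in the text.
    """
    if length < 5:
        raise ValueError("peptides should contain at least about 5 amino acids")
    if length > len(protein):
        return []
    return [(start + 1, protein[start:start + length])
            for start in range(len(protein) - length + 1)]


# Hypothetical 15-residue sequence used only for illustration.
example_protein = "MKTLLVAGILSAQWE"
for position, peptide in candidate_peptides(example_protein, length=10):
    print(position, peptide)
```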

Monoclonal antibodies to IRAP may be prepared using any technique which provides for the
production of antibody molecules by continuous cell lines in culture. These include, but are not
limited to, the hybridoma technique, the human B-cell hybridoma technique, and the
EBV-hybridoma technique (Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J.
Immunol. Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030;
Cole, S.P. et al. (1984) Mol. Cell Biol. 62:109-120).

In addition, techniques developed for the production of "chimeric antibodies," such as the
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used (Morrison, S.L. et al. (1984) Proc. Natl. Acad.
Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; Takeda, S. et al. (1985)
Nature 314:452-454). Alternatively, techniques described for the production of single chain
antibodies may be adapted, using methods known in the art, to produce IRAP-specific single chain
antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be
generated by chain shuffling from random combinatorial immunoglobulin libraries (Burton, D.R.
(1991) Proc. Natl. Acad. Sci. USA 88:10134-10137).

Antibodies may also be produced by inducing in vivo production in the lymphocyte
population or by screening immunoglobulin libraries or panels of highly specific binding reagents as
disclosed in the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837;
Winter, G. et al. (1991) Nature 349:293-299).

Antibody fragments which contain specific binding sites for IRAP may also be generated.
For example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity (Huse, W.D. et al.
(1989) Science 246:1275-1281).

Various immunoassays may be used for screening to identify antibodies having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between IRAP and its
specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies
reactive to two non-interfering IRAP epitopes is generally used, but a competitive binding assay may
also be employed (Pound, supra).

Various methods such as Scatchard analysis in conjunction with radioimmunoassay
techniques may be used to assess the affinity of antibodies for IRAP. Affinity is expressed as an
association constant, Ka, which is defined as the molar concentration of IRAP-antibody complex
divided by the molar concentrations of free antigen and free antibody under equilibrium conditions.
The Ka determined for a preparation of polyclonal antibodies, which are heterogeneous in their
affinities for multiple IRAP epitopes, represents the average affinity, or avidity, of the antibodies for
IRAP. The Ka determined for a preparation of monoclonal antibodies, which are monospecific for a
particular IRAP epitope, represents a true measure of affinity. High-affinity antibody preparations
with Ka ranging from about 10^9 to 10^12 L/mole are preferred for use in immunoassays in which the
IRAP-antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations
with Ka ranging from about 10^6 to 10^7 L/mole are preferred for use in immunopurification and
similar procedures which ultimately require dissociation of IRAP, preferably in active form, from the
antibody (Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC;
Liddell, J.E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons,
New York NY).
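
The definition of the association constant and its estimation by Scatchard analysis can be sketched as follows. The sketch assumes a single class of non-interacting binding sites, for which bound/free = Ka x (Bmax - bound), so that a straight line fitted to bound/free against bound has slope -Ka. The numerical values are synthetic and hypothetical, generated from an assumed Ka, and are not measurements for any IRAP antibody.

```python
def association_constant(complex_molar: float,
                         free_antigen_molar: float,
                         free_antibody_molar: float) -> float:
    """Ka = [complex] / ([free antigen] * [free antibody]), in L/mole."""
    return complex_molar / (free_antigen_molar * free_antibody_molar)


def scatchard_fit(bound: list[float], free: list[float]) -> tuple[float, float]:
    """Least-squares line through (bound, bound/free); returns (Ka, Bmax).

    Assumption: one class of binding sites, so the Scatchard plot is linear
    with slope -Ka and x-intercept Bmax.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope
    bmax = intercept / ka
    return ka, bmax


# Synthetic equilibrium data from an assumed Ka and Bmax (hypothetical values).
true_ka = 1.0e9      # L/mole
true_bmax = 2.0e-9   # mole/L of binding sites
free_conc = [1e-10, 3e-10, 1e-9, 3e-9, 1e-8]
bound_conc = [true_bmax * true_ka * f / (1.0 + true_ka * f) for f in free_conc]

fitted_ka, fitted_bmax = scatchard_fit(bound_conc, free_conc)
print(f"fitted Ka = {fitted_ka:.3e} L/mole, fitted Bmax = {fitted_bmax:.3e} mole/L")
```

For a polyclonal preparation the Scatchard plot is generally curved rather than linear, so a single fitted slope of this kind would correspond to an average value rather than a true single-site affinity.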

The titer and avidity of polyclonal antibody preparations may be further evaluated to
determine the quality and suitability of such preparations for certain downstream applications. For
example, a polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml,
preferably 5-10 mg specific antibody/ml, is generally employed in procedures requiring precipitation
of IRAP-antibody complexes. Procedures for evaluating antibody specificity, titer, and avidity, and
guidelines for antibody quality and usage in various applications, are generally available (Catty,
supra; Coligan et al., supra).

In another embodiment of the invention, polynucleotides encoding IRAP, or any fragment or
complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA,
RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding
IRAP. Such technology is well known in the art, and antisense oligonucleotides or larger fragments
can be designed from various locations along the coding or control regions of sequences encoding
IRAP (Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press, Totowa NJ).
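
As a minimal sketch of what designing a complementary sequence involves, the antisense DNA oligonucleotide for a chosen region is the reverse complement of the sense (coding) strand. The sequence, coordinates, and function names below are hypothetical placeholders, and the sketch ignores the chemistry, secondary structure, and specificity checks that practical antisense design would require.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(sense: str) -> str:
    """Return the reverse complement of a sense-strand DNA sequence (5' to 3')."""
    unexpected = set(sense) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    return sense.translate(_COMPLEMENT)[::-1]


def antisense_for_region(coding_sequence: str, start: int, length: int) -> str:
    """Antisense oligonucleotide against `length` bases beginning at 0-based `start`."""
    region = coding_sequence[start:start + length]
    if start < 0 or len(region) != length:
        raise ValueError("requested region lies outside the sequence")
    return antisense_dna(region)


# Hypothetical coding-region fragment; not derived from any SEQ ID NO.
fragment = "ATGGCCAAGTTGACC"
print(antisense_for_region(fragment, start=0, length=12))
```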

In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein (Slater, J.E.
et al. (1998) J. Allergy Clin. Immunol. 102:469-475; Scanlon, K.J. et al. (1995) FASEB J.
9:1288-1296). Antisense sequences can also be introduced intracellularly through the use of viral
vectors, such as retrovirus and adeno-associated virus vectors (Miller, A.D. (1990) Blood 76:271-278;
Ausubel et al., supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63:323-347). Other gene
delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems
known in the art (Rossi, J.J. (1995) Br. Med. Bull. 51:217-225; Boado, R.J. et al. (1998) J. Pharm.
Sci. 87:1308-1315; Morris, M.C. et al. (1997) Nucleic Acids Res. 25:2730-2736).

In another embodiment of the invention, polynucleotides encoding IRAP may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by
X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias,
familial hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies
(Crystal, R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)),
(ii) express a conditionally lethal gene product (e.g., in the case of cancers which result from
unregulated cell proliferation), or (iii) express a protein which affords protection against intracellular
parasites (e.g., against human retroviruses, such as human immunodeficiency virus (HIV)
(Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA
93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans
and Paracoccidioides brasiliensis; and protozoan parasites such as Plasmodium falciparum and
Trypanosoma cruzi). In the case where a genetic deficiency in IRAP expression or regulation causes
disease, the expression of IRAP from an appropriate population of transduced cells may alleviate the
clinical manifestations caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in
IRAP are treated by constructing mammalian expression vectors encoding IRAP and introducing
these vectors by mechanical means into IRAP-deficient cells. Mechanical transfer technologies for
use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii)
ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene
transfer, and (v) the use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev.
Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J.-L. and H. Récipon (1998) Curr.
Opin. Biotechnol. 9:445-450).

Expression vectors that may be effective for the expression of IRAP include, but are not
limited to, the PCDNA3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors
(Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA),
and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (BD Clontech, Palo Alto CA). IRAP
may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV),
Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible
promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl.
Acad. Sci. USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and
H.M. Blau (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid
(Invitrogen)); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND;
Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible
promoter (Rossi, F.M.V. and H.M. Blau, supra)), or (iii) a tissue-specific promoter or the native
promoter of the endogenous gene encoding IRAP from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of
these standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to IRAP expression are treated by constructing a retrovirus vector consisting of (i) the
polynucleotide encoding IRAP under the control of an independent promoter or the retrovirus long
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc.
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey,
R. et al. (1998) J. Virol. 72:9873-9880). U.S. Patent No. 5,910,434 to Rigg ("Method for obtaining
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant")
discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by
reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+
T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in
the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol.
71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716;
Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood
89:2283-2290).

In an embodiment, an adenovirus-based gene therapy delivery system is used to deliver
polynucleotides encoding IRAP to cells which have one or more genetic abnormalities with respect
to the expression of IRAP. The construction and packaging of adenovirus-based vectors are well
known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven
to be versatile for importing genes encoding immunoregulatory proteins into intact islets in the
pancreas (Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral
vectors are described in U.S. Patent No. 5,707,618 to Armentano ("Adenovirus vectors for gene
therapy"), hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al.
(1999; Annu. Rev. Nutr. 19:511-544) and Verma, I.M. and N. Somia (1997; Nature 18:389:239-242).

In another embodiment, a herpes-based, gene therapy delivery system is used to deliver
polynucleotides encoding IRAP to target cells which have one or more genetic abnormalities with
respect to the expression of IRAP. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing IRAP to cells of the central nervous system, for which HSV has a
tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector
has been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent No. 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby
incorporated by reference. U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92
which consists of a genome containing at least one exogenous gene to be transferred to a cell under
the control of the appropriate promoter for purposes including human gene therapy. Also taught by
this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and
ICP22. For HSV vectors, see also Goins, W.F. et al. (1999; J. Virol. 73:519-532) and Xu, H. et al.
(1994; Dev. Biol. 163:152-161). The manipulation of cloned herpesvirus sequences, the generation
of recombinant virus following the transfection of multiple plasmids containing different segments
of the large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of
cells with herpesvirus are techniques well known to those of ordinary skill in the art.

In another embodiment, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver polynucleotides encoding IRAP to target cells. The biology of the prototypic alphavirus,
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based
on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid
proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA,
resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic
activity (e.g., protease and polymerase). Similarly, inserting the coding sequence for IRAP into the
alphavirus genome in place of the capsid-coding region results in the production of a large number of
IRAP-coding RNAs and the synthesis of high levels of IRAP in vector transduced cells. While
alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a
persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN)
indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy
application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will
allow the introduction of IRAP into a variety of cell types. The specific transduction of a subset of
cells in a population may require the sorting of cells prior to transduction. The methods of
manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA
transfections, and performing alphavirus infections, are well known to those with ordinary skill in
the art.

Oligonucleotides derived from the transcription initiation site, e.g., between about positions
-10 and +10 from the start site, may also be employed to inhibit gene expression. Similarly,
inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful
because it causes inhibition of the ability of the double helix to open sufficiently for the binding of
polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using
triplex DNA have been described in the literature (Gee, J.E. et al. (1994) in Huber, B.E. and B.I.
Carr, Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177). A
complementary sequence or antisense molecule may also be designed to block translation of mRNA
by preventing the transcript from binding to ribosomes.

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze
endonucleolytic cleavage of RNA molecules encoding IRAP.

Specific ribozyme cleavage sites within any potential RNA target are initially identified by
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA,
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides,
corresponding to the region of the target gene containing the cleavage site, may be evaluated for
secondary structural features which may render the oligonucleotide inoperable. The suitability of
candidate targets may also be evaluated by testing accessibility to hybridization with complementary
oligonucleotides using ribonuclease protection assays.
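
The scanning step described above can be sketched as follows. This is a minimal illustration, not
the applicants' procedure: it locates GUA, GUU, and GUC triplets in an RNA sequence and returns a
surrounding window of up to 20 ribonucleotides for later secondary-structure evaluation. The function
name, the choice to roughly center the window on the triplet, and the clipping at sequence ends are
assumptions.

```python
# Illustrative sketch only. Window placement around the triplet is an assumption.
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_ribozyme_sites(rna: str, window: int = 20):
    """Return (position, triplet, surrounding_window) for each candidate cleavage site."""
    rna = rna.upper().replace("T", "U")  # accept cDNA-style input as well
    flank = (window - 3) // 2            # bases kept upstream of the triplet
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in CLEAVAGE_TRIPLETS:
            start = max(0, i - flank)
            end = min(len(rna), start + window)
            sites.append((i, triplet, rna[start:end]))
    return sites
```

Each returned window is a candidate region; whether it is structurally accessible would still have to
be assessed separately, for example with the ribonuclease protection assays mentioned above.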

Complementary ribonucleic acid molecules and ribozymes may be prepared by any method
known in the art for the synthesis of nucleic acid molecules. These include techniques for
chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis.
Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA
molecules encoding IRAP. Such DNA sequences may be incorporated into a wide variety of vectors
with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs
that synthesize complementary RNA, constitutively or inducibly, can be introduced into cell lines,
cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3'
ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester
linkages within the backbone of the molecule. This concept is inherent in the production of PNAs
and can be extended in all of these molecules by the inclusion of nontraditional bases such as
inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of
adenine, cytosine, guanine, thymine, and uracil which are not as easily recognized by endogenous
endonucleases.

In other embodiments of the invention, the expression of one or more selected
polynucleotides of the present invention can be altered, inhibited, decreased, or silenced using RNA
interference (RNAi) or post-transcriptional gene silencing (PTGS) methods known in the art. RNAi
is a post-transcriptional mode of gene silencing in which double-stranded RNA (dsRNA) introduced
into a targeted cell specifically suppresses the expression of the homologous gene (i.e., the gene
bearing the sequence complementary to the dsRNA). This effectively knocks out or substantially
reduces the expression of the targeted gene. PTGS can also be accomplished by use of DNA or
DNA fragments. RNAi methods are described by Fire, A. et al. (1998; Nature 391:806-811)
and Gura, T. (2000; Nature 404:804-808). PTGS can also be initiated by introduction of a
complementary segment of DNA into the selected tissue using gene delivery and/or viral vector
delivery methods described herein or known in the art.

RNAi can be induced in mammalian cells by the use of small interfering RNA, also known as
siRNA. siRNA are short segments of dsRNA (typically about 21 to 23 nucleotides in length) that
result in vivo from cleavage of introduced dsRNA by the action of an endogenous ribonuclease.
siRNA appear to be the mediators of the RNAi effect in mammals. The most effective siRNAs
appear to be 21 nucleotide dsRNAs with 2 nucleotide 3' overhangs. The use of siRNA for inducing
RNAi in mammalian cells is described by Elbashir, S.M. et al. (2001; Nature 411:494-498).

siRNA can be generated indirectly by introduction of dsRNA into the targeted cell.
Alternatively, siRNA can be synthesized directly and introduced into a cell by transfection methods
and agents described herein or known in the art (such as liposome-mediated transfection, viral vector
methods, or other polynucleotide delivery/introductory methods). Suitable siRNAs can be selected
by examining a transcript of the target polynucleotide (e.g., mRNA) for nucleotide sequences
downstream from the AUG start codon and recording the occurrence of each nucleotide and the 3'
adjacent 19 to 23 nucleotides as potential siRNA target sites, with sequences having a 21 nucleotide
length being preferred. Regions to be avoided for target siRNA sites include the 5' and 3'
untranslated regions (UTRs) and regions near the start codon (within 75 bases), as these may be
richer in regulatory protein binding sites. UTR-binding proteins and/or translation initiation
complexes may interfere with binding of the siRNP endonuclease complex. The selected target sites
for siRNA can then be compared to the appropriate genome database (e.g., human, etc.) using
BLAST or other sequence comparison algorithms known in the art. Target sequences with
significant homology to other coding sequences can be eliminated from consideration. The selected
siRNAs can be produced by chemical synthesis methods known in the art or by in vitro transcription
using commercially available methods and kits such as the SILENCER siRNA construction kit
(Ambion, Austin TX).
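
The site-selection rules in the preceding paragraph can be sketched as a simple enumeration. This
is a hedged illustration under stated assumptions: the caller supplies the coding-region boundaries
(`cds_start` at the A of the AUG, `cds_end` one past the stop codon), candidate sites are fixed at the
preferred 21-nucleotide length, and "within 75 bases" is read as excluding sites that start fewer than
75 bases downstream of the start codon. The function name and parameters are hypothetical, and the
subsequent BLAST comparison against a genome database is not shown.

```python
# Illustrative sketch only. Boundary conventions and parameter names are assumptions.
def sirna_candidate_sites(mrna: str, cds_start: int, cds_end: int,
                          site_len: int = 21, skip_after_start: int = 75):
    """Enumerate candidate siRNA target sites inside the coding region.

    Sites in the 5' and 3' UTRs, and sites starting within `skip_after_start`
    bases of the AUG start codon, are excluded.
    """
    mrna = mrna.upper().replace("T", "U")
    if mrna[cds_start:cds_start + 3] != "AUG":
        raise ValueError("cds_start must point at the AUG start codon")
    first = cds_start + skip_after_start   # first allowed site start
    last = cds_end - site_len              # last start that still fits in the CDS
    return [(pos, mrna[pos:pos + site_len]) for pos in range(first, last + 1)]
```

The resulting list is only a starting pool; sequences with significant homology to other coding
sequences would still need to be filtered out as the text describes.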

In alternative embodiments, long-term gene silencing and/or RNAi effects can be induced in
selected tissue using expression vectors that continuously express siRNA. This can be accomplished
using expression vectors that are engineered to express hairpin RNAs (shRNAs) using methods
known in the art (see, e.g., Brummelkamp, T.R. et al. (2002) Science 296:550-553; and Paddison,
P.J. et al. (2002) Genes Dev. 16:948-958). In these and related embodiments, shRNAs can be
delivered to target cells using expression vectors known in the art. An example of a suitable
expression vector for delivery of siRNA is the PSILENCER1.0-U6 (circular) plasmid (Ambion).
Once delivered to the target tissue, shRNAs are processed in vivo into siRNA-like molecules capable
of carrying out gene-specific silencing.

In various embodiments, the expression levels of genes targeted by RNAi or PTGS methods
can be determined by assays for mRNA and/or protein analysis. Expression levels of the mRNA of a
targeted gene can be determined, for example, by northern analysis methods using the
NORTHERNMAX-GLY kit (Ambion); by microarray methods; by PCR methods; by real time PCR
methods; and by other RNA/polynucleotide assays known in the art or described herein. Expression
levels of the protein encoded by the targeted gene can be determined, for example, by microarray
methods; by polyacrylamide gel electrophoresis; and by Western analysis using standard techniques
known in the art.

An additional embodiment of the invention encompasses a method for screening for a
compound which is effective in altering expression of a polynucleotide encoding IRAP. Compounds
which may be effective in altering expression of a specific polynucleotide may include, but are not
limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides,
transcription factors and other polypeptide transcriptional regulators, and non-macromolecular
chemical entities which are capable of interacting with specific polynucleotide sequences. Effective
compounds may alter polynucleotide expression by acting as either inhibitors or promoters of
polynucleotide expression. Thus, in the treatment of disorders associated with increased IRAP
expression or activity, a compound which specifically inhibits expression of the polynucleotide
encoding IRAP may be therapeutically useful, and in the treatment of disorders associated with
decreased IRAP expression or activity, a compound which specifically promotes expression of the
polynucleotide encoding IRAP may be therapeutically useful.

In various embodiments, one or more test compounds may be screened for effectiveness in
altering expression of a specific polynucleotide. A test compound may be obtained by any method
commonly known in the art, including chemical modification of a compound known to be effective
in altering polynucleotide expression; selection from an existing, commercially-available or
proprietary library of naturally-occurring or non-natural chemical compounds; rational design of a
compound based on chemical and/or structural properties of the target polynucleotide; and selection
from a library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding IRAP is exposed to at least one test compound thus obtained. The sample
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding IRAP are assayed
by any method commonly known in the art. Typically, the expression of a specific nucleotide is
detected by hybridization with a probe having a nucleotide sequence complementary to the sequence
of the polynucleotide encoding IRAP. The amount of hybridization may be quantified, thus
forming the basis for a comparison of the expression of the polynucleotide both with and without
exposure to one or more test compounds. Detection of a change in the expression of a
polynucleotide exposed to a test compound indicates that the test compound is effective in altering
the expression of the polynucleotide. A screen for a compound effective in altering expression of a
specific polynucleotide can be carried out, for example, using a Schizosaccharomyces pombe gene
expression system (Atkins, D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000)
Nucleic Acids Res. 28:E15) or a human cell line such as HeLa cell (Clarke, M.L. et al. (2000)
Biochem. Biophys. Res. Commun. 268:8-13). A particular embodiment of the present invention
involves screening a combinatorial library of oligonucleotides (such as deoxyribonucleotides,
ribonucleotides, peptide nucleic acids, and modified oligonucleotides) for antisense activity against a
specific polynucleotide sequence (Bruice, T.W. et al. (1997) U.S. Patent No. 5,686,242; Bruice,
T.W. et al. (2000) U.S. Patent No. 6,022,691).

Many methods for introducing vectors into cells or tissues are available and equally suitable
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be
achieved using methods which are well known in the art (Goldman, C.K. et al. (1997) Nat.
Biotechnol. 15:462-466).

Any of the therapeutic methods described above may be applied to any subject in need of
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits,
and monkeys.

An additional embodiment of the invention relates to the administration of a composition
which generally comprises an active ingredient formulated with a pharmaceutically acceptable
excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins.
Various formulations are commonly known and are thoroughly discussed in the latest edition of
Remington's Pharmaceutical Sciences (Maack Publishing, Easton PA). Such compositions may
consist of IRAP, antibodies to IRAP, and mimetics, agonists, antagonists, or inhibitors of IRAP.

In various embodiments, the compositions described herein, such as pharmaceutical
compositions, may be administered by any number of routes including, but not limited to, oral,
intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, pulmonary, 
transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means. 

Compositions for pulmonary administration may be prepared in liquid or dry powder form.
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the
case of small molecules (e.g. traditional low molecular weight organic drugs), aerosol delivery of
fast-acting formulations is well-known in the art. In the case of macromolecules (e.g. larger peptides
and proteins), recent developments in the field of pulmonary delivery via the alveolar region of the
lung have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g.,
Patton, J.S. et al., U.S. Patent No. 5,997,848). Pulmonary delivery allows administration without
needle injection, and obviates the need for potentially toxic penetration enhancers.

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination
of an effective dose is well within the capability of those skilled in the art.

Specialized forms of compositions may be prepared for direct intracellular delivery of
macromolecules comprising IRAP or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of
the macromolecule. Alternatively, IRAP or a fragment thereof may be joined to a short cationic
N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R.
et al. (1999) Science 285:1569-1572).

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs,
monkeys, or pigs. An animal model may also be used to determine the appropriate concentration
range and route of administration. Such information can then be used to determine useful doses and
routes for administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example IRAP
or fragments thereof, antibodies of IRAP, and agonists, antagonists or inhibitors of IRAP, which
ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by
standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no
toxicity. The dosage varies within this range depending upon the dosage form employed, the
sensitivity of the patient, and the route of administration.
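
The therapeutic index defined above is a simple ratio, and a short sketch may make the preference
for large indices concrete. The composition names and the LD50 and ED50 values below are invented
purely for illustration and do not come from the specification; the function name is likewise
hypothetical.

```python
# Illustrative sketch only. All dose values are hypothetical, in the same units (e.g. mg/kg).
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the LD50/ED50 ratio; larger values indicate a wider safety margin."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses in the same units")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs for two compositions.
compositions = {"composition_A": (500.0, 25.0), "composition_B": (300.0, 60.0)}

# Rank compositions so that the one with the largest therapeutic index comes first.
ranked = sorted(
    compositions,
    key=lambda name: therapeutic_index(*compositions[name]),
    reverse=True,
)
```

With these made-up numbers, composition_A has the larger index and would be the preferred
candidate under the criterion stated in the text.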

The exact dosage will be determined by the practitioner, in light of factors related to the
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of
the active moiety or to maintain the desired effect. Factors which may be taken into account include
the severity of the disease state, the general health of the subject, the age, weight, and gender of the
subject, time and frequency of administration, drug combination(s), reaction sensitivities, and
response to therapy. Long-acting compositions may be administered every 3 to 4 days, every week,
or biweekly depending on the half-life and clearance rate of the particular formulation.

Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells,
conditions, locations, etc.

DIAGNOSTICS

In another embodiment, antibodies which specifically bind IRAP may be used for the
diagnosis of disorders characterized by expression of IRAP, or in assays to monitor patients being
treated with IRAP or agonists, antagonists, or inhibitors of IRAP. Antibodies useful for diagnostic
purposes may be prepared in the same manner as described above for therapeutics. Diagnostic
assays for IRAP include methods which utilize the antibody and a label to detect IRAP in human
body fluids or in extracts of cells or tissues. The antibodies may be used with or without
modification, and may be labeled by covalent or non-covalent attachment of a reporter molecule. A
wide variety of reporter molecules, several of which are described above, are known in the art and
may be used.

A variety of protocols for measuring IRAP, including ELISAs, RIAs, and FACS, are known
in the art and provide a basis for diagnosing altered or abnormal levels of IRAP expression. Normal
or standard values for IRAP expression are established by combining body fluids or cell extracts
taken from normal mammalian subjects, for example, human subjects, with antibodies to IRAP under
conditions suitable for complex formation. The amount of standard complex formation may be
quantitated by various methods, such as photometric means. Quantities of IRAP expressed in
subject, control, and disease samples from biopsied tissues are compared with the standard values.
Deviation between standard and subject values establishes the parameters for diagnosing disease.

In another embodiment of the invention, polynucleotides encoding IRAP may be used for
diagnostic purposes. The polynucleotides which may be used include oligonucleotides,
complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect
and quantify gene expression in biopsied tissues in which expression of IRAP may be correlated with
disease. The diagnostic assay may be used to determine absence, presence, and excess expression of
IRAP, and to monitor regulation of IRAP levels during therapeutic intervention.

In one aspect, hybridization with PCR probes which are capable of detecting
polynucleotides, including genomic sequences, encoding IRAP or closely related molecules may be
used to identify nucleic acid sequences which encode IRAP. The specificity of the probe, whether it
is made from a highly specific region, e.g., the 5' regulatory region, or from a less specific region,
e.g., a conserved motif, and the stringency of the hybridization or amplification will determine
whether the probe identifies only naturally occurring sequences encoding IRAP, allelic variants, or
related sequences.

Probes may also be used for the detection of related sequences, and may have at least 50%
sequence identity to any of the IRAP encoding sequences. The hybridization probes of the subject
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:33-64 or from
genomic sequences including promoters, enhancers, and introns of the IRAP gene.
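
The 50% sequence identity figure above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, with '-' marking gaps, and it uses one common convention (identical positions divided by alignment columns). The function names and the example sequences are hypothetical and are not taken from SEQ ID NO:33-64.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumes both strings are already aligned (same length, '-' for gaps).
    Columns in which both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no alignable columns")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a, aligned_b, threshold=50.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical probe and target fragments, already aligned.
probe = "ACGTACGT"
target = "ACGTTCGA"
identity = percent_identity(probe, target)
related = meets_identity_threshold(probe, target)
```

Other identity conventions (for example, dividing by the length of the shorter sequence) give different numbers, so the denominator should be stated whenever a threshold is applied.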

Means for producing specific hybridization probes for polynucleotides encoding IRAP
include the cloning of polynucleotides encoding IRAP or IRAP derivatives into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available, and
may be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a
variety of reporter groups, for example, by radionuclides such as 32P or 35S, or by enzymatic labels,
such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

Polynucleotides encoding IRAP may be used for the diagnosis of disorders associated with
expression of IRAP. Examples of such disorders include, but are not limited to, an immune system
disorder such as acquired immunodeficiency syndrome (AIDS), X-linked agammaglobulinemia of
Bruton, common variable immunodeficiency (CVI), DiGeorge's syndrome (thymic hypoplasia),
thymic dysplasia, isolated IgA deficiency, severe combined immunodeficiency disease (SCID),
immunodeficiency with thrombocytopenia and eczema (Wiskott-Aldrich syndrome),
Chediak-Higashi syndrome, chronic granulomatous diseases, hereditary angioneurotic edema,
immunodeficiency associated with Cushing's disease, Addison's disease, adult respiratory distress
syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis,
autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema,
episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic
gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis,
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's
syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis,
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral,
bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a neurological disorder
such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease,
Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal
disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural
muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating
diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess,
suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system
disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases
of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis,
encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central
nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic
nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and
other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and
polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic
paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective
disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias,
paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy,
corticobasal degeneration, and familial frontotemporal dementia; a developmental disorder such as
renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and
Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor,
aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome,
myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary
neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism,
hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida,
anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss; a
muscle disorder such as cardiomyopathy, myocarditis, Duchenne's muscular dystrophy, Becker's
muscular dystrophy, myotonic dystrophy, central core disease, nemaline myopathy, centronuclear
myopathy, lipid myopathy, mitochondrial myopathy, infectious myositis, polymyositis,
dermatomyositis, inclusion body myositis, thyrotoxic myopathy, and ethanol myopathy; and a cell
proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis,
hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal
hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia; cancers including
adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in
particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, colon, gall
bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas,
parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; and a
disorder of lipid metabolism such as fatty liver, cholestasis, primary biliary cirrhosis, carnitine
deficiency, carnitine palmitoyltransferase deficiency, myoadenylate deaminase deficiency,
hypertriglyceridemia, lipid storage disorders such as Fabry's disease, Gaucher's disease,
Niemann-Pick's disease, metachromatic leukodystrophy, adrenoleukodystrophy, gangliosidosis, and
ceroid lipofuscinosis, abetalipoproteinemia, Tangier disease, hyperlipoproteinemia, diabetes
mellitus, lipodystrophy, lipomatoses, acute panniculitis, disseminated fat necrosis, adiposis dolorosa,
lipoid adrenal hyperplasia, minimal change disease, lipomas, atherosclerosis, hypercholesterolemia,
hypercholesterolemia with hypertriglyceridemia, primary hypoalphalipoproteinemia,
hypothyroidism, renal disease, liver disease, lecithin:cholesterol acyltransferase deficiency,
cerebrotendinous xanthomatosis, sitosterolemia, hypocholesterolemia, Tay-Sachs disease,
Sandhoff's disease, hyperlipidemia, hyperlipemia, lipid myopathies, and obesity. Polynucleotides
encoding IRAP may be used in Southern or northern analysis, dot blot, or other membrane-based
technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like assays; and in
microarrays utilizing fluids or tissues from patients to detect altered IRAP expression. Such
qualitative or quantitative methods are well known in the art.

In a particular embodiment, polynucleotides encoding IRAP may be used in assays that
detect the presence of associated disorders, particularly those mentioned above. Polynucleotides
complementary to sequences encoding IRAP may be labeled by standard methods and added to a
fluid or tissue sample from a patient under conditions suitable for the formation of hybridization
complexes. After a suitable incubation period, the sample is washed and the signal is quantified and
compared with a standard value. If the amount of signal in the patient sample is significantly altered
in comparison to a control sample then the presence of altered levels of polynucleotides encoding
IRAP in the sample indicates the presence of the associated disorder. Such assays may also be used
to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical
trials, or to monitor the treatment of an individual patient.

In order to provide a basis for the diagnosis of a disorder associated with expression of
IRAP, a normal or standard profile for expression is established. This may be accomplished by
combining body fluids or cell extracts taken from normal subjects, either animal or human, with a
sequence, or a fragment thereof, encoding IRAP, under conditions suitable for hybridization or
amplification. Standard hybridization may be quantified by comparing the values obtained from
normal subjects with values from an experiment in which a known amount of a substantially purified
polynucleotide is used. Standard values obtained in this manner may be compared with values
obtained from samples from patients who are symptomatic for a disorder. Deviation from standard
values is used to establish the presence of a disorder.

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained
from successive assays may be used to show the efficacy of treatment over a period ranging from
several days to months.
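
The comparison of patient values against standard values, and the follow-up over successive assays, can be sketched as below. This is only an illustration under stated assumptions: the standard profile is summarized as a mean and sample standard deviation, deviation is expressed in standard deviations, and the cutoff of 2.0 is an arbitrary choice that does not come from the specification. All function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def standard_profile(normal_values):
    """Summarize signals from normal subjects as (mean, sample std dev)."""
    return mean(normal_values), stdev(normal_values)


def deviation_score(value, profile):
    """Distance of a subject value from the standard mean, in std devs."""
    mu, sd = profile
    if sd == 0:
        raise ValueError("standard values show no spread")
    return (value - mu) / sd


def is_altered(value, profile, cutoff=2.0):
    """Flag a value whose deviation exceeds the (assumed) cutoff."""
    return abs(deviation_score(value, profile)) > cutoff


def approaching_normal(successive_values, profile):
    """True if the latest assay lies closer to the standard mean than the first."""
    scores = [abs(deviation_score(v, profile)) for v in successive_values]
    return len(scores) >= 2 and scores[-1] < scores[0]


# Hypothetical hybridization signals (arbitrary units).
profile = standard_profile([10.0, 12.0, 11.0, 9.0, 13.0])
patient_altered = is_altered(19.0, profile)
responding = approaching_normal([19.0, 16.0, 12.5], profile)
```

In practice the cutoff and the summary statistic would be chosen for the particular assay; the sketch only shows where those choices enter.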

With respect to cancer, the presence of an abnormal amount of transcript (either under- or
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the
development of the disease, or may provide a means for detecting the disease prior to the appearance
of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals
to employ preventative measures or aggressive treatment earlier, thereby preventing the development
or further progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding IRAP
may involve the use of PCR. These oligomers may be chemically synthesized, generated
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a
polynucleotide encoding IRAP, or a fragment of a polynucleotide complementary to the
polynucleotide encoding IRAP, and will be employed under optimized conditions for identification
of a specific gene or condition. Oligomers may also be employed under less stringent conditions for
detection or quantification of closely related DNA or RNA sequences.

In a particular aspect, oligonucleotide primers derived from polynucleotides encoding IRAP
may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions
and deletions that are a frequent cause of inherited or acquired genetic disease. Methods
of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP)
and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from
polynucleotides encoding IRAP are used to amplify DNA using the polymerase chain reaction
(PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy samples,
bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary
structures of PCR products in single-stranded form, and these differences are detectable using gel
electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently
labeled, which allows detection of the amplimers in high-throughput equipment such as DNA
sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP
(isSNP), are capable of identifying polymorphisms by comparing the sequence of individual
overlapping DNA fragments which assemble into a common consensus sequence. These
computer-based methods filter out sequence variations due to laboratory preparation of DNA and
sequencing errors using statistical models and automated analyses of DNA sequence chromatograms.
In the alternative, SNPs may be detected and characterized by mass spectrometry using, for example,
the high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
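
The in silico comparison of overlapping fragments described above can be sketched in a very reduced form. The sketch assumes the fragments have already been placed on a shared coordinate system (each given as an offset and a sequence), handles substitutions only, and replaces the statistical models mentioned in the text with a crude filter: a minor base must appear in at least `min_reads` fragments and at a fraction of at least `min_fraction`. These names, thresholds, and example fragments are hypothetical.

```python
from collections import Counter


def column_counts(fragments):
    """Count bases per position for fragments given as (offset, sequence)."""
    columns = {}
    for offset, seq in fragments:
        for i, base in enumerate(seq.upper()):
            columns.setdefault(offset + i, Counter())[base] += 1
    return columns


def consensus_and_candidates(fragments, min_reads=2, min_fraction=0.2):
    """Return the consensus string and candidate substitution sites.

    Each candidate is (position, consensus_base, minor_base). A minor base
    seen in fewer than min_reads fragments is treated as a likely error.
    """
    columns = column_counts(fragments)
    consensus = []
    candidates = []
    for pos in sorted(columns):
        ranked = columns[pos].most_common()
        major_base = ranked[0][0]
        consensus.append(major_base)
        if len(ranked) > 1:
            minor_base, minor_count = ranked[1]
            depth = sum(columns[pos].values())
            if minor_count >= min_reads and minor_count / depth >= min_fraction:
                candidates.append((pos, major_base, minor_base))
    return "".join(consensus), candidates


# Hypothetical overlapping fragments; position 3 varies (T or A), and one
# fragment carries an isolated mismatch at position 2.
fragments = [
    (0, "ACGTAC"), (0, "ACGTAC"), (0, "ACGTAC"),
    (2, "GAACGG"), (2, "GAACGG"),
    (3, "TACGG"),
    (0, "ACCTAC"),
]
consensus, candidates = consensus_and_candidates(fragments)
```

The isolated mismatch at position 2 is discarded by the read-count filter, while the variant at position 3, supported by two fragments, is reported as a candidate.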

SNPs may be used to study the genetic basis of human disease. For example, at least 16
common SNPs have been associated with non-insulin-dependent diabetes mellitus. SNPs are also
useful for examining differences in disease outcomes in monogenic disorders, such as cystic fibrosis,
sickle cell anemia, or chronic granulomatous disease. For example, variants in the mannose-binding
lectin, MBL2, have been shown to be correlated with deleterious pulmonary outcomes in cystic
fibrosis. SNPs also have utility in pharmacogenomics, the identification of genetic variants that
influence a patient's response to a drug, such as life-threatening toxicity. For example, a variation in
N-acetyl transferase is associated with a high incidence of peripheral neuropathy in response to the
anti-tuberculosis drug isoniazid, while a variation in the core promoter of the ALOX5 gene results in
diminished clinical response to treatment with an anti-asthma drug that targets the 5-lipoxygenase
pathway. Analysis of the distribution of SNPs in different populations is useful for investigating
genetic drift, mutation, recombination, and selection, as well as for tracing the origins of populations
and their migrations (Taylor, J.G. et al. (2001) Trends Mol. Med. 7:507-512; Kwok, P.-Y. and Z. Gu
(1999) Mol. Med. Today 5:538-543; Nowotny, P. et al. (2001) Curr. Opin. Neurobiol. 11:637-641).

Methods which may also be used to quantify the expression of IRAP include radiolabeling or
biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from
standard curves (Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al.
(1993) Anal. Biochem. 212:229-236). The speed of quantitation of multiple samples may be
accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide
of interest is presented in various dilutions and a spectrophotometric or colorimetric response gives
rapid quantitation.
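
Interpolating a result from a standard curve can be sketched as follows. The sketch assumes a dilution series of known amounts with measured signals, a signal that rises strictly with amount, and simple linear interpolation between the two bracketing standards; real assays may call for a fitted (for example, sigmoidal) curve instead. The function name and all values are hypothetical.

```python
def interpolate_from_standard_curve(signal, standards):
    """Estimate an amount from a measured signal.

    standards: list of (known_amount, measured_signal) pairs.
    Uses linear interpolation between the two bracketing standards and
    refuses to extrapolate beyond the measured range.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s1 <= s0 or a1 <= a0:
            raise ValueError("standard curve must be strictly increasing")
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical dilution series: (amount, absorbance).
standards = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
estimate = interpolate_from_standard_curve(0.35, standards)
```

Refusing to extrapolate is a deliberate choice here: a sample whose signal falls outside the series would normally be re-assayed at a different dilution.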

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotides described herein may be used as elements on a microarray. The microarray can be
used in transcript imaging techniques which monitor the relative expression levels of large numbers
of genes simultaneously as described below. The microarray may also be used to identify genetic
variants, mutations, and polymorphisms. This information may be used to determine gene function,
to understand the genetic basis of a disorder, to diagnose a disorder, to monitor
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be
used to develop a pharmacogenomic profile of a patient in order to select the most appropriate and
effective treatment regimen for that patient. For example, therapeutic agents which are highly
effective and display the fewest side effects may be selected for a patient based on his/her
pharmacogenomic profile.

In another embodiment, IRAP, fragments of IRAP, or antibodies specific for IRAP may be
used as elements on a microarray. The microarray may be used to monitor or measure
protein-protein interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern
of gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed
by quantifying the number of expressed genes and their relative abundance under given conditions
and at a given time (Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent No.
5,840,484; hereby expressly incorporated by reference herein). Thus a transcript image may be
generated by hybridizing the polynucleotides of the present invention or their complements to the
totality of transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present
invention or their complements comprise a subset of a plurality of elements on a microarray. The
resultant transcript image would provide a profile of gene activity.
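
As a rough illustration of "the number of expressed genes and their relative abundance," a transcript image can be represented as a mapping from each expressed gene to its share of the total signal. The sketch assumes per-gene signal values are already available (for instance, from array elements), and treats any value below `min_signal` as not expressed. The function name, the threshold, and the gene labels are hypothetical.

```python
def transcript_image(signals, min_signal=1.0):
    """Relative abundance of each expressed gene.

    signals: dict mapping gene label -> measured signal.
    Genes below min_signal are treated as not expressed and omitted.
    """
    expressed = {g: s for g, s in signals.items() if s >= min_signal}
    total = sum(expressed.values())
    if total == 0:
        return {}
    return {g: s / total for g, s in expressed.items()}


# Hypothetical signals for four array elements.
image = transcript_image(
    {"geneA": 50.0, "geneB": 30.0, "geneC": 20.0, "geneD": 0.0}
)
number_expressed = len(image)
```

Images built this way for two conditions can be compared gene by gene, which is the use the following paragraphs describe.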

Transcript images may be generated using transcripts isolated from tissues, cell lines,
biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo,
as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

Transcript images which profile the expression of the polynucleotides of the present
invention may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and
toxicity (Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson
(2000) Toxicol. Lett. 112-113:467-471). If a test compound has a signature similar to that of a
compound with known toxicity, it is likely to share those toxic properties. These fingerprints or
signatures are most useful and refined when they contain expression information from a large
number of genes and gene families. Ideally, a genome-wide measurement of expression provides the
highest quality signature. Even genes whose expression is not altered by any tested compounds are
important as well, as the levels of expression of these genes are used to normalize the rest of the
expression data. The normalization procedure is useful for comparison of expression data after
treatment with different compounds. While the assignment of gene function to elements of a
toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not
necessary for the statistical matching of signatures which leads to prediction of toxicity (see, for
example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released
February 29, 2000, available at niehs.nih.gov/oc/news/toxchip.htm). Therefore, it is important and
desirable in toxicological screening using toxicant signatures to include all expressed gene
sequences.

In an embodiment, the toxicity of a test compound can be assessed by treating a biological
sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the
treated biological sample are hybridized with one or more probes specific to the polynucleotides of
the present invention, so that transcript levels corresponding to the polynucleotides of the present
invention may be quantified. The transcript levels in the treated biological sample are compared
with levels in an untreated biological sample. Differences in the transcript levels between the two
samples are indicative of a toxic response caused by the test compound in the treated sample.
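
The treated-versus-untreated comparison, together with normalization against genes whose expression is not altered, can be sketched as below. The sketch makes several assumptions that are not in the text: each sample is scaled by the median level of a chosen set of reference genes, differences are expressed as log2 ratios, and a twofold change (|log2| >= 1.0) is used as an arbitrary reporting threshold. Gene labels and values are hypothetical.

```python
from math import log2
from statistics import median


def normalize(levels, reference_genes):
    """Scale transcript levels by the median level of the reference genes."""
    scale = median(levels[g] for g in reference_genes)
    if scale <= 0:
        raise ValueError("reference genes must have positive levels")
    return {g: v / scale for g, v in levels.items()}


def log2_fold_changes(treated, untreated, reference_genes):
    """log2(treated / untreated) per gene after normalization.

    Genes missing from either sample, or with a non-positive level,
    are skipped rather than guessed at.
    """
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    return {
        g: log2(t[g] / u[g])
        for g in t
        if g in u and t[g] > 0 and u[g] > 0
    }


def responsive_genes(fold_changes, threshold=1.0):
    """Genes whose absolute log2 change meets the (assumed) threshold."""
    return sorted(g for g, fc in fold_changes.items() if abs(fc) >= threshold)


# Hypothetical levels; the treated sample was measured at twice the overall
# intensity, which normalization against ref1 and ref2 removes.
untreated = {"ref1": 100.0, "ref2": 200.0, "geneX": 50.0, "geneY": 80.0}
treated = {"ref1": 200.0, "ref2": 400.0, "geneX": 400.0, "geneY": 160.0}
changes = log2_fold_changes(treated, untreated, ["ref1", "ref2"])
flagged = responsive_genes(changes)
```

Without the normalization step, geneY would appear to double; after it, only geneX shows a change, which is the purpose the text assigns to unaltered genes.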

Another embodiment relates to the use of the polypeptides disclosed herein to analyze the
proteome of a tissue or cell type. The term proteome refers to the global pattern of protein
expression in a particular tissue or cell type. Each protein component of a proteome can be
subjected individually to further analysis. Proteome expression patterns, or profiles, are analyzed by
quantifying the number of expressed proteins and their relative abundance under given conditions
and at a given time. A profile of a cell's proteome may thus be generated by separating and
analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is
achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated
by isoelectric focusing in the first dimension, and then according to molecular weight by sodium
dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The
proteins are visualized in the gel as discrete and uniquely positioned spots, typically by staining the
gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each
protein spot is generally proportional to the level of the protein in the sample. The optical densities
of equivalently positioned protein spots from different samples, for example, from biological
samples either treated or untreated with a test compound or therapeutic agent, are compared to
identify any changes in protein spot density related to the treatment. The proteins in the spots are
partially sequenced using, for example, standard methods employing chemical or enzymatic cleavage
followed by mass spectrometry. The identity of the protein in a spot may be determined by
comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the
polypeptide sequences of interest. In some cases, further sequence data may be obtained for
definitive protein identification.
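
Matching a partial sequence from a gel spot against polypeptide sequences of interest can be sketched as an exact search for shared runs of contiguous residues. The sketch assumes exact matching only (no tolerance for sequencing ambiguity) and a minimum run of 5 residues; the function names and the peptide and candidate sequences are hypothetical and are not sequences from this disclosure.

```python
def shared_peptides(partial_sequence, candidate, min_length=5):
    """Windows of min_length contiguous residues found in the candidate."""
    partial = partial_sequence.upper()
    target = candidate.upper()
    return [
        partial[i:i + min_length]
        for i in range(len(partial) - min_length + 1)
        if partial[i:i + min_length] in target
    ]


def identify_spot(partial_sequence, polypeptides, min_length=5):
    """Names of candidate polypeptides sharing at least one window."""
    return [
        name
        for name, seq in polypeptides.items()
        if shared_peptides(partial_sequence, seq, min_length)
    ]


# Hypothetical candidate sequences and a hypothetical partial sequence.
candidates = {
    "candidate_1": "MKTLLVAGGSWQERTYIPLK",
    "candidate_2": "MSDNNQPRAAVLHHGFE",
}
hits = identify_spot("GGSWQE", candidates)
```

A match on a short window is only suggestive, which is consistent with the text's note that further sequence data may be needed for definitive identification.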

A proteomic profile may also be generated using antibodies specific for IRAP to quantify the
levels of IRAP expression. In one embodiment, the antibodies are used as elements on a microarray,
and protein expression levels are quantified by contacting the microarray with the sample and
detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem.
270:103-111; Mendoze, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed
by a variety of methods known in the art, for example, by reacting the proteins in the sample with a
thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at
each array element.

Toxicant signatures at the proteome level are also useful for toxicological screening, and
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor
correlation between transcript and protein abundances for some proteins in some tissues (Anderson,
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to
rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such
cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins that are expressed in the treated
biological sample are separated so that the amount of each protein can be quantified. The amount of
each protein is compared to the amount of the corresponding protein in an untreated biological
sample. A difference in the amount of protein between the two samples is indicative of a toxic
response to the test compound in the treated sample. Individual proteins are identified by sequencing
the amino acid residues of the individual proteins and comparing these partial sequences to the
polypeptides of the present invention.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins from the biological sample are
incubated with antibodies specific to the polypeptides of the present invention. The amount of
protein recognized by the antibodies is quantified. The amount of protein in the treated biological
sample is compared with the amount in an untreated biological sample. A difference in the amount
of protein between the two samples is indicative of a toxic response to the test compound in the
treated sample.

Microarrays may be prepared, used, and analyzed using methods known in the art (Brennan,
T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA
93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/25116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA
94:2150-2155; Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662). Various types of microarrays
are well known and thoroughly described in Schena, M., ed. (1999; DNA Microarrays: A Practical
Approach, Oxford University Press, London).

In another embodiment of the invention, nucleic acid sequences encoding IRAP may be used
to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be
preferable over coding sequences. For example, conservation of a coding sequence among members
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries (Harrington, J.J. et al. (1997) Nat. Genet.
15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; Trask, B.J. (1991) Trends Genet. 7:149-154).
Once mapped, the nucleic acid sequences may be used to develop genetic linkage maps, for example,
which correlate the inheritance of a disease state with the inheritance of a particular chromosome
region or restriction fragment length polymorphism (RFLP) (Lander, E.S. and D. Botstein (1986)
Proc. Natl. Acad. Sci. USA 83:7353-7357).

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic
map data (Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968). Examples of genetic map data
can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM)
World Wide Web site. Correlation between the location of the gene encoding IRAP on a physical
map and a specific disorder, or a predisposition to a specific disorder, may help define the region of
DNA associated with that disorder and thus may further positional cloning efforts.

In situ hybridization of chromosomal preparations and physical mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending genetic maps.
Often the placement of a gene on the chromosome of another mammalian species, such as mouse,
may reveal associated markers even if the exact chromosomal locus is not known. This information
is valuable to investigators searching for disease genes using positional cloning or other gene
discovery techniques. Once the gene or genes responsible for a disease or syndrome have been
crudely localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to
11q22-23, any sequences mapping to that area may represent associated or regulatory genes for
further investigation (Gatti, R.A. et al. (1988) Nature 336:577-580). The nucleotide sequence of the
instant invention may also be used to detect differences in the chromosomal location due to
translocation, inversion, etc., among normal, carrier, or affected individuals.

In another embodiment of the invention, IRAP, its catalytic or immunogenic fragments, or
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes
between IRAP and the agent being tested may be measured.

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest (Geysen, et al. (1984) PCT application
WO84/03564). In this method, large numbers of different small test compounds are synthesized on a
solid substrate. The test compounds are reacted with IRAP, or fragments thereof, and washed.
Bound IRAP is then detected by methods well known in the art. Purified IRAP can also be coated
directly onto plates for use in the aforementioned drug screening techniques. Alternatively,
non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

In another embodiment, one may use competitive drug screening assays in which
neutralizing antibodies capable of binding IRAP specifically compete with a test compound for
binding IRAP. In this manner, antibodies can be used to detect the presence of any peptide which
shares one or more antigenic determinants with IRAP.

In additional embodiments, the nucleotide sequences which encode IRAP may be used in
any molecular biology techniques that have yet to be developed, provided the new techniques rely on
properties of nucleotide sequences that are currently known, including, but not limited to, such
properties as the triplet genetic code and specific base pair interactions.

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are,
therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure
in any way whatsoever.

The disclosures of all patents, applications, and publications mentioned above and below, 
including U.S. Ser. No. 60/429,442, U.S. Ser. No. 60/429,839, U.S. Ser. No. 60/439,946 and U.S.
Ser. No. 60/446,182, are hereby expressly incorporated by reference. 


EXAMPLES 

I. Construction of cDNA Libraries 

Incyte cDNAs are derived from cDNA libraries described in the LIFESEQ database (Incyte, 
Palo Alto CA). Some tissues are homogenized and lysed in guanidinium isothiocyanate, while 
others are homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL
(Invitrogen), a monophasic solution of phenol and guanidine isothiocyanate. The resulting lysates
are centrifuged over CsCl cushions or extracted with chloroform. RNA is precipitated from the
lysates with either isopropanol or sodium acetate and ethanol, or by other routine methods. 

Phenol extraction and precipitation of RNA are repeated as necessary to increase RNA
purity. In some cases, RNA is treated with DNase. For most libraries, poly(A)+ RNA is isolated




using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN,
Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA is
isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA
purification kit (Ambion, Austin TX).
In some cases, Stratagene is provided with RNA and constructs the corresponding cDNA
libraries. Otherwise, cDNA is synthesized and cDNA libraries are constructed with the UNIZAP
vector system (Stratagene) or SUPERSCRIPT plasmid system (Invitrogen), using the recommended
procedures or similar methods known in the art (Ausubel et al., supra, ch. 5). Reverse transcription
is initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters are ligated to
double stranded cDNA, and the cDNA is digested with the appropriate restriction enzyme or
enzymes. For most libraries, the cDNA is size-selected (300-1000 bp) using SEPHACRYL S1000,
SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Biosciences) or
preparative agarose gel electrophoresis. cDNAs are ligated into compatible restriction enzyme sites
of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1
plasmid (Invitrogen, Carlsbad CA), PCDNA2.1 plasmid (Invitrogen), PBK-CMV plasmid
(Stratagene), PCR2-TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid (Stratagene), pIGEN
(Incyte, Palo Alto CA), pRARE (Incyte), or pINCY (Incyte), or derivatives thereof. Recombinant
plasmids are transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR
from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Invitrogen.

II. Isolation of cDNA Clones

Plasmids obtained as described in Example I are recovered from host cells by in vivo
excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids are purified using
at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega);
an AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8
Plasmid, QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the
R.E.A.L. PREP 96 plasmid purification kit from QIAGEN. Following precipitation, plasmids are
resuspended in 0.1 ml of distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA is amplified from host cell lysates using direct link PCR in a
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal
cycling steps are carried out in a single reaction mixture. Samples are processed and stored in 384-
well plates, and the concentration of amplified plasmid DNA is quantified fluorometrically using
PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence scanner
(Labsystems Oy, Helsinki, Finland).

III. Sequencing and Analysis

Incyte cDNA recovered in plasmids as described in Example II are sequenced as follows.




Sequencing reactions are processed using standard methods or high-throughput instrumentation such
as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler
(MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions are prepared
using reagents provided by Amersham Biosciences or supplied in ABI sequencing kits such as the
ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides
are carried out using the MEGABACE 1000 DNA sequencing system (Amersham Biosciences); the
ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI
protocols and base calling software; or other sequence analysis systems known in the art. Reading
frames within the cDNA sequences are identified using standard methods (Ausubel et al., supra, ch.
7). Some of the cDNA sequences are selected for extension using the techniques disclosed in
Example VIII.

Polynucleotide sequences derived from Incyte cDNAs are validated by removing vector,
linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and programs
based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The Incyte
cDNA sequences or translations thereof are then queried against a selection of public databases such
as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and BLOCKS,
PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo sapiens, Rattus
norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae,
Schizosaccharomyces pombe, and Candida albicans (Incyte, Palo Alto CA); hidden Markov model
(HMM)-based protein family databases such as PFAM, INCY, and TIGRFAM (Haft, D.H. et al.
(2001) Nucleic Acids Res. 29:41-43); and HMM-based protein domain databases such as SMART
(Schultz, J. et al. (1998) Proc. Natl. Acad. Sci. USA 95:5857-5864; Letunic, I. et al. (2002) Nucleic
Acids Res. 30:242-244). (HMM is a probabilistic approach which analyzes consensus primary
structures of gene families; see, for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol. 6:361-365.)
The queries are performed using programs based on BLAST, FASTA, BLIMPS, and HMMER. The
Incyte cDNA sequences are assembled to produce full length polynucleotide sequences.
Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or
Genscan-predicted coding sequences (see Examples IV and V) are used to extend Incyte cDNA
assemblages to full length. Assembly is performed using programs based on Phred, Phrap, and
Consed, and cDNA assemblages are screened for open reading frames using programs based on
GeneMark, BLAST, and FASTA. The full length polynucleotide sequences are translated to derive
the corresponding full length polypeptide sequences. Alternatively, a polypeptide may begin at any
of the methionine residues of the full length translated polypeptide. Full length polypeptide




sequences are subsequently analyzed by querying against databases such as the GenBank protein
databases (genpept), SwissProt, the PROTEOME databases, BLOCKS, PRINTS, DOMO,
PRODOM, Prosite, hidden Markov model (HMM)-based protein family databases such as PFAM,
INCY, and TIGRFAM; and HMM-based protein domain databases such as SMART. Full length
polynucleotide sequences are also analyzed using MACDNASIS PRO software (MiraiBio, Alameda
CA) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence
alignments are generated using default parameters specified by the CLUSTAL algorithm as
incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also
calculates the percent identity between aligned sequences.

Table 7 summarizes tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and
threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used,
the second column provides brief descriptions thereof, the third column presents appropriate
references, all of which are incorporated by reference herein in their entirety, and the fourth column
presents, where applicable, the scores, probability values, and other parameters used to evaluate the
strength of a match between two sequences (the higher the score or the lower the probability value,
the greater the identity between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide
and polypeptide sequences are also used to identify polynucleotide sequence fragments from SEQ ID
NO:33-64. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization
and amplification technologies are described in Table 4, column 2.
IV. Identification and Editing of Coding Sequences from Genomic DNA 

Putative immune response associated proteins are initially identified by running the Genscan
gene identification program against public genomic sequence databases (e.g., gbpri and gbhtg).
Genscan is a general-purpose gene identification program which analyzes genomic DNA sequences
from a variety of organisms (Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94; Burge, C. and
S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to
form an assembled cDNA sequence extending from a methionine to a stop codon. The output of
Genscan is a FASTA database of polynucleotide and polypeptide sequences. The maximum range of
sequence for Genscan to analyze at once is set to 30 kb. To determine which of these Genscan
predicted cDNA sequences encode immune response associated proteins, the encoded polypeptides
are analyzed by querying against PFAM models for immune response associated proteins. Potential
immune response associated proteins are also identified by homology to Incyte cDNA sequences that
have been annotated as immune response associated proteins. These selected Genscan-predicted
sequences are then compared by BLAST analysis to the genpept and gbpri public databases. Where




necessary, the Genscan-predicted sequences are then edited by comparison to the top BLAST hit
from genpept to correct errors in the sequence predicted by Genscan, such as extra or omitted exons.
BLAST analysis is also used to find any Incyte cDNA or public cDNA coverage of the Genscan-
predicted sequences, thus providing evidence for transcription. When Incyte cDNA coverage is
available, this information is used to correct or confirm the Genscan predicted sequence. Full length
polynucleotide sequences are obtained by assembling Genscan-predicted coding sequences with
Incyte cDNA sequences and/or public cDNA sequences using the assembly process described in
Example III. Alternatively, full length polynucleotide sequences are derived entirely from edited or
unedited Genscan-predicted coding sequences.

V. Assembly of Genomic Sequence Data with cDNA Sequence Data
"Stitched" Sequences 

Partial cDNA sequences are extended with exons predicted by the Genscan gene
identification program described in Example IV. Partial cDNAs assembled as described in Example
III are mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan
exon predictions from one or more genomic sequences. Each cluster is analyzed using an algorithm
based on graph theory and dynamic programming to integrate cDNA and genomic information,
generating possible splice variants that are subsequently confirmed, edited, or extended to create a
full length sequence. Sequence intervals in which the entire length of the interval is present on more
than one sequence in the cluster are identified, and intervals thus identified are considered to be
equivalent by transitivity. For example, if an interval is present on a cDNA and two genomic
sequences, then all three intervals are considered to be equivalent. This process allows unrelated but
consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals thus
identified are then "stitched" together by the stitching algorithm in the order that they appear along
their parent sequences to generate the longest possible sequence, as well as sequence variants.
Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or
genomic sequence to genomic sequence) are given preference over linkages which change parent
type (cDNA to genomic sequence). The resultant stitched sequences are translated and compared by
BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan
are corrected by comparison to the top BLAST hit from genpept. Sequences are further extended
with additional cDNA sequences, or by inspection of genomic DNA, when necessary.
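
The equivalence-by-transitivity step described above can be illustrated with a union-find structure. This is only a sketch under stated assumptions: the source does not disclose the actual stitching algorithm, and the function name `group_equivalent` and the interval labels are hypothetical.

```python
def group_equivalent(intervals, shared_pairs):
    """Group intervals into equivalence classes using union-find.

    intervals: iterable of hashable interval labels (hypothetical naming).
    shared_pairs: pairs of labels observed to cover the same interval.
    """
    parent = {label: label for label in intervals}

    def find(label):
        # Walk up to the root, shortening the path along the way.
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    for a, b in shared_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for label in parent:
        groups.setdefault(find(label), set()).add(label)
    return list(groups.values())


# One interval seen on a cDNA and on two genomic sequences, plus an unrelated one.
groups = group_equivalent(
    ["cDNA:1", "genomicA:1", "genomicB:1", "genomicB:2"],
    [("cDNA:1", "genomicA:1"), ("cDNA:1", "genomicB:1")],
)
```

In this example the two genomic intervals are never compared directly, yet they end up in the same group because both are linked to the cDNA interval.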

"Stretched" Sequences

Partial DNA sequences are extended to full length with an algorithm based on BLAST
analysis. First, partial cDNAs assembled as described in Example III are queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases
using the BLAST program. The nearest GenBank protein homolog is then compared by BLAST




analysis to either Incyte cDNA sequences or GenScan exon predicted sequences described in
Example IV. A chimeric protein is generated by using the resultant high-scoring segment pairs
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions
may occur in the chimeric protein with respect to the original GenBank protein homolog. The
GenBank protein homolog, the chimeric protein, or both are used as probes to search for homologous
genomic sequences from the public human genome databases. Partial DNA sequences are therefore
"stretched" or extended by the addition of homologous genomic sequences. The resultant stretched
sequences are examined to determine whether they contain a complete gene.

VI. Chromosomal Mapping of IRAP Encoding Polynucleotides

The sequences used to assemble SEQ ID NO:33-64 are compared with sequences from the
Incyte LIFESEQ database and public domain databases using BLAST and other implementations of
the Smith-Waterman algorithm. Sequences from these databases that matched SEQ ID NO:33-64
are assembled into clusters of contiguous and overlapping sequences using assembly algorithms
such as Phrap (Table 7). Radiation hybrid and genetic mapping data available from public resources
such as the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome Research
(WIGR), and Généthon are used to determine if any of the clustered sequences have been previously
mapped. Inclusion of a mapped sequence in a cluster results in the assignment of all sequences of
that cluster, including its particular SEQ ID NO:, to that map location.
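
The assignment rule in the last sentence can be sketched as a simple propagation over clusters. This is a minimal illustration, assuming clusters and known map locations are held in dictionaries; the function name `assign_map_locations` and all identifiers are hypothetical, and conflicts between differently mapped members of one cluster are not handled.

```python
def assign_map_locations(clusters, known_locations):
    """Give every sequence in a cluster the location of a mapped member.

    clusters: dict of cluster id -> list of sequence ids.
    known_locations: dict of sequence id -> map location (mapped sequences only).
    """
    assigned = {}
    for members in clusters.values():
        # Assumption: use the first previously mapped member found in the cluster.
        location = next(
            (known_locations[s] for s in members if s in known_locations), None
        )
        if location is None:
            continue  # no previously mapped sequence in this cluster
        for seq_id in members:
            assigned[seq_id] = location
    return assigned
```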

Map locations are represented by ranges, or intervals, of human chromosomes. The map
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in
humans, although this can vary widely due to hot and cold spots of recombination.) The cM
distances are based on genetic markers mapped by Généthon which provide boundaries for radiation
hybrid markers whose sequences were included in each of the clusters. Human genome maps and
other resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site
(ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified disease genes
map within or in proximity to the intervals indicated above.

VII. Analysis of Polynucleotide Expression

Northern analysis is a laboratory technique used to detect the presence of a transcript of a
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound (Sambrook and Russell, supra, ch. 7; Ausubel
et al., supra, ch. 4).

Analogous computer techniques applying BLAST are used to search for identical or related
molecules in databases such as GenBank or LIFESEQ (Incyte). This analysis is much faster than






multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be
modified to determine whether any particular match is categorized as exact or similar. The basis of
the search is the product score, which is defined as:

\[
\text{Product score} = \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}
\]

The product score takes into account both the degree of similarity between two sequences and the
length of the sequence match. The product score is a normalized value between 0 and 100, and is
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate
the product score. The product score represents a balance between fractional overlap and quality in
a BLAST alignment. For example, a product score of 100 is produced only for 100% identity over
the entire length of the shorter of the two sequences being compared. A product score of 70 is
produced either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap
at the other. A product score of 50 is produced either by 100% identity and 50% overlap at one end,
or 79% identity and 100% overlap.
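
The product score calculation described above can be sketched as follows. This is a minimal illustration, assuming a single ungapped HSP described only by its match and mismatch counts; the function names `blast_score` and `product_score` are hypothetical and do not belong to BLAST or to any Incyte software.

```python
def blast_score(matches: int, mismatches: int) -> int:
    """Score an ungapped HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches: int, mismatches: int,
                  len_seq1: int, len_seq2: int) -> float:
    """Return BLAST score x percent identity / (5 x length of shorter sequence)."""
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)
```

For a shorter sequence of 100 bases, `product_score(100, 0, 100, 250)` gives 100 and `product_score(70, 0, 100, 100)` gives 70, matching the first cases in the text. With 88 matches and 12 mismatches over the full 100 bases the sketch gives roughly 69, consistent with the rounded value of 70 quoted above.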

Alternatively, polynucleotides encoding IRAP are analyzed with respect to the tissue sources
from which they are derived. For example, some full length sequences are assembled, at least in
part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived
from a cDNA library constructed from a human tissue. Each human tissue is classified into one of
the following organ/tissue categories: cardiovascular system; connective tissue; digestive system;
embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ
cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas;
respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract.
The number of libraries in each category is counted and divided by the total number of libraries
across all categories. Similarly, each human tissue is classified into one of the following
disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma,
cardiovascular, pooled, and other, and the number of libraries in each category is counted and
divided by the total number of libraries across all categories. The resulting percentages reflect the
tissue- and disease-specific expression of cDNA encoding IRAP. cDNA sequences and cDNA
library/tissue information are found in the LIFESEQ database (Incyte, Palo Alto CA).
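
The per-category percentages described above amount to a count divided by a total. The sketch below assumes each library is represented by a single category label; the function name `category_percentages` and the example labels are hypothetical.

```python
from collections import Counter


def category_percentages(library_categories):
    """Map each category to its share (in percent) of all libraries."""
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical annotations, one entry per cDNA library containing the sequence.
example = ["liver", "nervous system", "liver", "skin"]
percentages = category_percentages(example)
```

Here two of the four libraries are from liver, so `percentages["liver"]` is 50.0. The same function applies unchanged to the disease/condition categories.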






VIII. Extension of IRAP Encoding Polynucleotides

Full length polynucleotides are produced by extension of an appropriate fragment of the full
length molecule using oligonucleotide primers designed from this fragment. One primer is
synthesized to initiate 5' extension of the known fragment, and the other primer is synthesized to
initiate 3' extension of the known fragment. The initial primers are designed using OLIGO 4.06
software (National Biosciences), or another appropriate program, to be about 22 to 30 nucleotides in
length, to have a GC content of about 50% or more, and to anneal to the target sequence at
temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin
structures and primer-primer dimerizations is avoided.
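
Two of the primer criteria above, length and GC content, are easy to check programmatically. The sketch below is not the OLIGO 4.06 procedure: it treats the "about" values in the text as hard thresholds (an assumption) and does not evaluate annealing temperature, hairpins, or primer dimers. The function names are hypothetical.

```python
def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    primer = primer.upper()
    if not primer:
        return 0.0
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_basic_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                         min_gc: float = 0.50) -> bool:
    """Check only the length and GC-content criteria stated in the text."""
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc
```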

Selected human cDNA libraries are used to extend the sequence. If more than one extension
is necessary or desired, additional or nested sets of primers are designed.

High fidelity amplification is obtained by PCR using methods well known in the art. PCR is
performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction mix
contains DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, and
2-mercaptoethanol, Taq DNA polymerase (Amersham Biosciences), ELONGASE enzyme
(Invitrogen), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair
PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2
min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the
alternative, the parameters for primer pair T7 and SK+ are as follows: Step 1: 94°C, 3 min; Step 2:
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times;
Step 6: 68°C, 5 min; Step 7: storage at 4°C.
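
The cycling parameters above can be written down as data, which makes it easy to estimate the programmed run time. This is a sketch under two assumptions that the text does not settle: "repeated 20 times" is read as 20 passes through Steps 2 to 4 in total, and temperature ramping and the final 4°C storage step are ignored. The names `Step` and `programmed_hold_seconds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


def programmed_hold_seconds(initial, cycle_steps, cycles, final):
    """Sum of programmed hold times, ignoring ramping and the storage step."""
    per_cycle = sum(step.seconds for step in cycle_steps)
    return initial.seconds + cycles * per_cycle + final.seconds


# Parameters for primer pair PCI A and PCI B as listed in the text.
initial = Step(94, 180)
cycle_steps = [Step(94, 15), Step(60, 60), Step(68, 120)]
final = Step(68, 300)
total = programmed_hold_seconds(initial, cycle_steps, cycles=20, final=final)
```

Under these assumptions the programmed holds add up to 4380 seconds, or 73 minutes. The T7 and SK+ program differs only in the 57°C annealing temperature, so its hold time is the same.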

The concentration of DNA in each well is determined by dispensing 100 µl PICOGREEN
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar,
Acton MA), allowing the DNA to bind to the reagent. The plate is scanned in a Fluoroskan II
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture is analyzed by electrophoresis
on a 1% agarose gel to determine which reactions are successful in extending the sequence.

The extended nucleotides are desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Biosciences). For shotgun
sequencing, the digested nucleotides are separated on low concentration (0.6 to 0.8%) agarose gels,
fragments are excised, and agar digested with Agar ACE (Promega). Extended clones were religated
using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector (Amersham Biosciences),
treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and transfected




into competent E. coli cells. Transformed cells are selected on antibiotic-containing media, and
individual colonies are picked and cultured overnight at 37°C in 384-well plates in LB/2x carb liquid
media.

The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham
Biosciences) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94°C, 3
min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4
repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA is quantified by PICOGREEN
reagent (Molecular Probes) as described above. Samples with low DNA recoveries are reamplified
using the same conditions as described above. Samples are diluted with 20% dimethylsulfoxide (1:2,
v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC
DIRECT kit (Amersham Biosciences) or the ABI PRISM BIGDYE Terminator cycle sequencing
ready reaction kit (Applied Biosystems).

In like manner, full length polynucleotides are verified using the above procedure or are used
to obtain 5' regulatory sequences using the above procedure along with oligonucleotides designed for
such extension, and an appropriate genomic library.

IX. Identification of Single Nucleotide Polymorphisms in IRAP Encoding Polynucleotides 
Common DNA sequence variants known as single nucleotide polymorphisms (SNPs) are
identified in SEQ ID NO:33-64 using the LIFESEQ database (Incyte). Sequences from the same
gene are clustered together and assembled as described in Example III, allowing the identification of
all sequence variants in the gene. An algorithm consisting of a series of filters is used to distinguish
SNPs from other sequence variants. Preliminary filters remove the majority of basecall errors by
requiring a minimum Phred quality score of 15, and remove sequence alignment errors and errors
resulting from improper trimming of vector sequences, chimeras, and splice variants. An automated
procedure of advanced chromosome analysis is applied to the original chromatogram files in the
vicinity of the putative SNP. Clone error filters use statistically generated algorithms to identify
errors introduced during laboratory processing, such as those caused by reverse transcriptase,
polymerase, or somatic mutation. Clustering error filters use statistically generated algorithms to
identify errors resulting from clustering of close homologs or pseudogenes, or due to contamination
by non-human sequences. A final set of filters removes duplicates and SNPs found in
immunoglobulins or T-cell receptors.
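
The preliminary basecall filter above relies on the Phred quality score, which is conventionally defined as Q = -10 log10(P), where P is the estimated probability that the base call is wrong. The sketch below shows that conversion and a threshold check at Q = 15; the function names are hypothetical, and the remaining filters in the series are not modelled.

```python
def phred_error_probability(quality: float) -> float:
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def passes_basecall_filter(quality: float, min_quality: float = 15) -> bool:
    """Keep a candidate variant only if its base call meets the minimum score."""
    return quality >= min_quality
```

A score of 15 corresponds to an error probability of about 0.03, so the filter discards base calls with more than roughly a 3% chance of being wrong.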

Certain SNPs are selected for further characterization by mass spectrometry using the high
throughput MASSARRAY system (Sequenom, Inc.) to analyze allele frequencies at the SNP sites in
four different human populations. The Caucasian population comprises 92 individuals (46 male, 46
female), including 83 from Utah, four French, three Venezuelan, and two Amish individuals. The
African population comprises 194 individuals (97 male, 97 female), all African Americans. The






Hispanic population comprises 324 individuals (162 male, 162 female), all Mexican Hispanic. The
Asian population comprises 126 individuals (64 male, 62 female) with a reported parental
breakdown of 43% Chinese, 31% Japanese, 13% Korean, 5% Vietnamese, and 8% other Asian.
Allele frequencies are first analyzed in the Caucasian population; in some cases those SNPs which
show no allelic variance in this population are not further tested in the other three populations.
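
The screening rule above (test the remaining populations only if the first shows allelic variance) can be sketched from genotype counts. This is a minimal illustration, assuming diploid genotype counts per population are available; the source does not say how the MASSARRAY output is processed, and the function names and example counts are hypothetical.

```python
def allele_frequency(hom_ref: int, het: int, hom_alt: int) -> float:
    """Frequency of the alternate allele from diploid genotype counts."""
    total_alleles = 2 * (hom_ref + het + hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    return (het + 2 * hom_alt) / total_alleles


def is_variable(frequency: float) -> bool:
    """True when both alleles are observed in the population."""
    return 0.0 < frequency < 1.0
```

For a population of 92 individuals, hypothetical counts of 70, 20, and 2 give an alternate allele frequency of 24/184, and the SNP would be carried forward. If all 92 individuals were homozygous for the same allele, `is_variable` would return False.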

X. Labeling and Use of Individual Hybridization Probes 

Hybridization probes derived from SEQ ID NO:33-64 are employed to screen cDNAs,
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base
pairs, is specifically described, essentially the same procedure is used with larger nucleotide
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of
[γ-32P] adenosine triphosphate (Amersham Biosciences), and T4 polynucleotide kinase (DuPont
NEN, Boston MA). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25
superfine size exclusion dextran bead column (Amersham Biosciences). An aliquot containing
10^ counts per minute of the labeled probe is used in a typical membrane-based hybridization
analysis of human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II,
EcoRI, Pst I, Xba I, or Pvu II (DuPont NEN).

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to
NYTRAN PLUS nylon membranes (Schleicher & Schuell, Durham NH). Hybridization is carried
out for 16 hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room
temperature under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium
dodecyl sulfate. Hybridization patterns are visualized using autoradiography or an alternative
imaging means and compared.

XI. Microarrays 

The linkage or synthesis of array elements upon a microarray can be achieved utilizing
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler et al., supra),
mechanical microspotting technologies, and derivatives thereof. The substrate in each of the
aforementioned technologies should be uniform and solid with a non-porous surface (Schena, M., ed.
(1999) DNA Microarrays: A Practical Approach, Oxford University Press, London). Suggested
substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a
procedure analogous to a dot or slot blot may also be used to arrange and link elements to the surface
of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may
be produced using available methods and machines well known to those of ordinary skill in the art
and may contain any appropriate number of elements (Schena, M. et al. (1995) Science 270:467-470;
Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998) Nat.






Biotechnol. 16:27-31).

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof
may comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can
be selected using software well known in the art such as LASERGENE software (DNASTAR). The
array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in
the biological sample are conjugated to a fluorescent label or other molecular tag for ease of
detection. After hybridization, nonhybridized nucleotides from the biological sample are removed,
and a fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser
desorption and mass spectrometry may be used for detection of hybridization. The degree of
complementarity and the relative abundance of each polynucleotide which hybridizes to an element
on the microarray may be assessed. In one embodiment, microarray preparation and usage is
described in detail below.
Tissue or Cell Sample Preparation 

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-(dT) primer (21mer), 1X
first strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40
µM dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Biosciences). The reverse
transcription reaction is performed in a 25 ml volume containing 200 ng poly(A)+ RNA with
GEMBRIGHT kits (Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro
transcription from non-coding yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction
sample (one with Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium
hydroxide and incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA.
Samples are purified using two successive CHROMA SPIN 30 gel filtration spin columns (BD
Clontech, Palo Alto CA) and after combining, both reaction samples are ethanol precipitated using 1
ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The sample is then
dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended in
14 µl 5X SSC/0.2% SDS.
Microarray Preparation 

Sequences of the present invention are used to generate array elements. Each array element
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification
uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5
µg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Biosciences).
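
As a rough check on the quantities above, one can ask what average per-cycle amplification factor is needed to go from the starting amount to the final amount in thirty cycles. The sketch assumes a constant factor in every cycle, which real PCR does not achieve; the function name `required_cycle_factor` is hypothetical.

```python
def required_cycle_factor(start_ng: float, final_ng: float, cycles: int) -> float:
    """Average per-cycle fold increase needed to reach final_ng from start_ng."""
    if start_ng <= 0 or cycles <= 0:
        raise ValueError("start_ng and cycles must be positive")
    return (final_ng / start_ng) ** (1.0 / cycles)


# 2 ng amplified to 5 micrograms (5000 ng) over thirty cycles.
factor = required_cycle_factor(2.0, 5000.0, 30)
```

The result is roughly 1.3 per cycle, well below the theoretical doubling, so the stated yield is modest relative to ideal amplification.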

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope






slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water,
and coated with 0.05% aminopropyl silane (Sigma-Aldrich, St. Louis MO) in 95% ethanol. Coated
slides are cured in a 110°C oven.

Array elements are applied to the coated glass substrate using a procedure described in U.S.
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic
apparatus. The apparatus then deposits about 5 nl of array element sample per slide.
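
The spotting quantities above imply a DNA mass per spot and an upper bound on the number of slides one capillary load can print. The following sketch works through that arithmetic; the names are illustrative, and it assumes no dead volume or evaporation, which the procedure does not address.

```python
LOADED_NL = 1000        # 1 ul loaded into the capillary, expressed in nanoliters
DEPOSIT_NL = 5          # about 5 nl deposited per slide
CONC_NG_PER_UL = 100    # average array element DNA concentration


def dna_per_spot_ng(deposit_nl: int, conc_ng_per_ul: int) -> float:
    """Mass of DNA (ng) in one deposited spot."""
    return deposit_nl * conc_ng_per_ul / 1000  # convert nl to ul


def max_slides_per_load(loaded_nl: int, deposit_nl: int) -> int:
    """Upper bound on slides printed from one load, ignoring any losses."""
    return loaded_nl // deposit_nl


print(dna_per_spot_ng(DEPOSIT_NL, CONC_NG_PER_UL))   # ng of DNA per spot
print(max_slides_per_load(LOADED_NL, DEPOSIT_NL))    # slides per capillary load
```

With the stated values this gives about 0.5 ng of DNA per spot and at most 200 slides per load.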

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in
0.2% SDS and distilled water as before.

Hybridization

Hybridization reactions contain 9 µl of sample mixture consisting of 0.2 µg each of Cy3 and
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample
mixture is heated to 65° C for 5 minutes and is aliquoted onto the microarray surface and covered
with a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just
slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the
addition of 140 µl of 5X SSC in a corner of the chamber. The chamber containing the arrays is
incubated for about 6.5 hours at 60° C. The arrays are washed for 10 min at 45° C in a first wash
buffer (1X SSC, 0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X
SSC), and dried.

Detection

Reporter-labeled hybridization complexes are detected with a microscope equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral
lines at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light
is focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-
scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a
resolution of 20 micrometers.
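
The array dimensions and scan resolution above determine the size of the resulting image. A minimal sketch of that calculation follows; the function name is illustrative, and it assumes one pixel per 20-micrometer step in each direction, which the text implies but does not state.

```python
def scan_grid(side_cm: float, resolution_um: float) -> tuple[int, int]:
    """Return (pixels per side, total pixels) for a square raster scan."""
    side_um = side_cm * 10_000          # 1 cm = 10,000 micrometers
    per_side = round(side_um / resolution_um)
    return per_side, per_side * per_side


per_side, total = scan_grid(1.8, 20)
print(per_side, total)
```

For a 1.8 cm x 1.8 cm array at 20 micrometers this yields 900 pixels per side, or 810,000 pixels per scan.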

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially.
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477,
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores.
Appropriate filters positioned between the array and the photomultiplier tubes are used to filter the
signals. The emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5.
Each array is typically scanned twice, one scan per fluorophore using the appropriate filters at the
laser source, although the apparatus is capable of recording the spectra from both fluorophores
simultaneously.

The sensitivity of the scans is typically calibrated using the signal intensity generated by a
cDNA control species added to the sample mixture at a known concentration. A specific location on
the array contains a complementary DNA sequence, allowing the intensity of the signal at that
location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples
from different sources (e.g., representing test and control cells), each labeled with a different
fluorophore, are hybridized to a single array for the purpose of identifying genes that are
differentially expressed, the calibration is done by labeling samples of the calibrating cDNA with the
two fluorophores and adding identical amounts of each to the hybridization mixture.

The output of the photomultiplier tube is digitized using a 12-bit RH-835H analog-to-digital
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC
computer. The digitized data are displayed as an image where the signal intensity is mapped using a
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping
emission spectra) between the fluorophores using each fluorophore's emission spectrum.
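
The linear 20-color display mapping described above can be illustrated as follows. This is only a sketch: the text does not say how the 12-bit range is divided, so equal-width bins are an assumption, and the names `ADC_BITS`, `N_COLORS`, and `color_bin` are hypothetical.

```python
ADC_BITS = 12    # resolution of the A/D conversion board
N_COLORS = 20    # number of pseudocolor steps, blue (low) to red (high)


def color_bin(value: int) -> int:
    """Map a 12-bit intensity (0..4095) to a color index (0 = blue .. 19 = red).

    Assumes equal-width bins across the full ADC range.
    """
    levels = 1 << ADC_BITS  # 4096 distinct ADC codes
    if not 0 <= value < levels:
        raise ValueError(f"value {value} is outside the {ADC_BITS}-bit range")
    return value * N_COLORS // levels


for sample in (0, 2048, 4095):
    print(sample, color_bin(sample))
```

Each bin then spans about 205 ADC codes, so the display is deliberately coarser than the quantitative data used for analysis.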

A grid is superimposed over the fluorescence signal image such that the signal from each
spot is centered in each element of the grid. The fluorescence signal within each element is then
integrated to obtain a numerical value corresponding to the average intensity of the signal. The
software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
Array elements that exhibit at least about a two-fold change in expression, a signal-to-background
ratio of at least about 2.5, and an element spot size of at least about 40%, are considered to be
differentially expressed.
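
The three criteria above can be expressed as a simple filter. The sketch below is not the GEMTOOLS implementation; the `ArrayElement` fields, the use of the brighter channel for the signal-to-background ratio, and the reading of "spot size" as a fraction of the nominal spot area are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ArrayElement:
    test_signal: float       # integrated signal in the test channel
    reference_signal: float  # integrated signal in the reference channel
    background: float        # local background estimate (assumed available)
    spot_size: float         # fraction of the nominal spot area, 0.0 to 1.0


def fold_change(e: ArrayElement) -> float:
    """Symmetric fold change: 2.0 means two-fold up or two-fold down."""
    ratio = e.test_signal / e.reference_signal
    return ratio if ratio >= 1.0 else 1.0 / ratio


def is_differentially_expressed(e: ArrayElement, min_fold: float = 2.0,
                                min_s2b: float = 2.5,
                                min_spot: float = 0.40) -> bool:
    """Apply the fold-change, signal-to-background, and spot-size thresholds."""
    if min(e.test_signal, e.reference_signal, e.background) <= 0:
        return False  # cannot form meaningful ratios
    s2b = max(e.test_signal, e.reference_signal) / e.background
    return (fold_change(e) >= min_fold
            and s2b >= min_s2b
            and e.spot_size >= min_spot)


print(is_differentially_expressed(ArrayElement(4000, 1000, 200, 0.8)))
```

An element must pass all three thresholds, so a large ratio on a faint or undersized spot is rejected.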
Expression 

For example, SEQ ID NO:34 showed tissue-specific expression as determined by microarray
analysis. RNA samples isolated from a variety of normal human tissues were compared to a
common reference sample. Tissues contributing to the reference sample were selected for their
ability to provide a complete distribution of RNA in the human body and include brain (4%), heart
(7%), kidney (3%), lung (8%), placenta (46%), small intestine (9%), spleen (3%), stomach (6%),
testis (9%), and uterus (5%). The normal tissues assayed were obtained from at least three different
donors. RNA from each donor was separately isolated and individually hybridized to the microarray.
Since these hybridization experiments were conducted using a common reference sample,
differential expression values are directly comparable from one tissue to another. The expression of
SEQ ID NO:34 was increased by at least two-fold in testis as compared to the reference sample.
Therefore, SEQ ID NO:34 can be used as a tissue marker for testis.
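
The composition of the common reference sample listed above can be checked and scaled to any total amount. This is a minimal sketch: it assumes the percentages are fractions of total RNA mass, which the text does not state explicitly, and the names are illustrative.

```python
# Tissue contributions to the common reference sample, in percent.
REFERENCE_POOL_PERCENT = {
    "brain": 4, "heart": 7, "kidney": 3, "lung": 8, "placenta": 46,
    "small intestine": 9, "spleen": 3, "stomach": 6, "testis": 9, "uterus": 5,
}


def pool_contributions(total_ug: float) -> dict[str, float]:
    """RNA amount (ug) per tissue for a reference pool of the given total."""
    if sum(REFERENCE_POOL_PERCENT.values()) != 100:
        raise ValueError("reference pool percentages must sum to 100")
    return {tissue: total_ug * pct / 100
            for tissue, pct in REFERENCE_POOL_PERCENT.items()}


for tissue, amount in pool_contributions(100.0).items():
    print(f"{tissue}: {amount} ug")
```

The ten listed percentages sum to exactly 100, with placenta alone contributing nearly half of the pool.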
SEQ ID NO:39 showed differential expression, as determined by microarray analysis, in
androgen-treated (methyltrienolone (R1881), a synthetic androgen analog) human prostate tumor
cells (DU-145) as compared to untreated cells. DU-145 is a prostate carcinoma cell line isolated
from a metastatic site in the brain. Control and androgen-treated cells were compared at various time
points. The expression of SEQ ID NO:39 was decreased by at least two-fold at the first time point
only (4 hours). Therefore, in various embodiments, SEQ ID NO:39 can be used for one or more of
the following: i) monitoring treatment of prostate cancer, ii) diagnostic assays for prostate cancer,
and iii) developing therapeutics and/or other treatments for prostate cancer.

For example, expression of SEQ ID NO:41 was downregulated in diseased colon tissue
versus normal colon tissue as determined by microarray analysis. Gene expression profiles were
obtained by comparing normal colon tissue to colon adenocarcinoma tissue from the same donor
(Huntsman Cancer Institute, Salt Lake City, UT) by competitive hybridization. Expression of SEQ
ID NO:41 was decreased at least two-fold in colon adenocarcinoma tissue when compared to normal
colon tissue from the same donor. Therefore, in various embodiments, SEQ ID NO:41 can be used
for one or more of the following: i) monitoring treatment of colon cancer, ii) diagnostic assays for
colon cancer, and iii) developing therapeutics and/or other treatments for colon cancer.

For example, SEQ ID NO:42 showed tissue-specific expression as determined by microarray
analysis. RNA samples isolated from a variety of normal human tissues were compared to a
common reference sample. Tissues contributing to the reference sample were selected for their
ability to provide a complete distribution of RNA in the human body and include brain (4%), heart
(7%), kidney (3%), lung (8%), placenta (46%), small intestine (9%), spleen (3%), stomach (6%),
testis (9%), and uterus (5%). The normal tissues assayed were obtained from at least three different
donors. RNA from each donor was separately isolated and individually hybridized to the microarray.
Since these hybridization experiments were conducted using a common reference sample,
differential expression values are directly comparable from one tissue to another. The expression of
SEQ ID NO:42 was increased by at least two-fold in esophagus tissue as compared to the reference
sample. Therefore, SEQ ID NO:42 can be used as a tissue marker for esophagus.

In another example, SEQ ID NO:42 showed tissue-specific expression as determined by
microarray analysis. RNA samples isolated from a variety of normal human tissues were compared
to a common reference sample. Tissues contributing to the reference sample were selected for their
ability to provide a complete distribution of RNA in the human body and include brain (4%), heart
(7%), kidney (3%), lung (8%), placenta (46%), small intestine (9%), spleen (3%), stomach (6%),
testis (9%), and uterus (5%). The normal tissues assayed were obtained from at least three different
donors. RNA from each donor was separately isolated and individually hybridized to the microarray.
Since these hybridization experiments were conducted using a common reference sample,
differential expression values are directly comparable from one tissue to another. The expression of
SEQ ID NO:42 was increased by at least two-fold in gallbladder tissue as compared to the reference
sample. Therefore, SEQ ID NO:42 can be used as a tissue marker for gallbladder.

For example, expression of SEQ ID NO:45 was downregulated in treated breast tissue versus
untreated breast tissue as determined by microarray analysis. Gene expression profiles of
nonmalignant mammary epithelial cells were compared to gene expression profiles of various breast
carcinoma lines at different stages of tumor progression. The cells were grown in defined serum-free
H14 medium to 70-80% confluence prior to RNA harvest. Cell lines compared included: a) HMEC,
a primary breast epithelial cell line isolated from a normal donor, b) MCF-10A, a breast mammary
gland cell line isolated from a 36-year-old woman with fibrocystic breast disease, c) MCF7, a
nonmalignant breast adenocarcinoma cell line isolated from the pleural effusion of a 69-year-old
female, d) T-47D, a breast carcinoma cell line isolated from a pleural effusion obtained from a 54-
year-old female with an infiltrating ductal carcinoma of the breast, e) Sk-BR-3, a breast
adenocarcinoma cell line isolated from a malignant pleural effusion of a 43-year-old female, f) BT-
20, a breast carcinoma cell line derived in vitro from cells emigrating out of thin slices of the tumor
mass isolated from a 74-year-old female, g) MDA-mb-231, a breast tumor cell line isolated from the
pleural effusion of a 51-year-old female, and h) MDA-mb-435S, a spindle-shaped strain that evolved
from the parent line (435) isolated by R. Cailleau from pleural effusion of a 31-year-old female with
metastatic, ductal adenocarcinoma of the breast. Expression of SEQ ID NO:45 was decreased
between five- and nine-fold in MCF7, T-47D, Sk-BR-3, BT-20, and MDA-mb-435S cell lines.
Therefore, in various embodiments, SEQ ID NO:45 can be used for one or more of the following: i)
monitoring treatment of breast cancer, ii) diagnostic assays for breast cancer, and iii) developing
therapeutics and/or other treatments for breast cancer.

In yet another example, expression of SEQ ID NO:45 was upregulated in diseased colon
tissue versus normal colon tissue as determined by microarray analysis. Gene expression profiles
were obtained by comparing normal sigmoid colon tissue from a donor to a sigmoid colon tumor
originating from a metastatic gastric sarcoma (stromal tumor) from the same donor (Huntsman
Cancer Institute, Salt Lake City, UT). Expression of SEQ ID NO:45 was increased at least two-fold
in the sigmoid colon tumor when compared to normal sigmoid colon tissue from the same
donor. Therefore, in various embodiments, SEQ ID NO:45 can be used for one or more of the
following: i) monitoring treatment of colon cancer, ii) diagnostic assays for colon cancer, and iii)
developing therapeutics and/or other treatments for colon cancer.

In a further example, expression of SEQ ID NO:45 was upregulated in diseased ovarian
tissue versus normal ovarian tissue as determined by microarray analysis. A normal ovary was
compared to an ovarian tumor from the same donor (Huntsman Cancer Institute, Salt Lake City, UT).
Expression of SEQ ID NO:45 was increased at least two-fold in the ovarian tumor tissue when
compared to normal ovarian tissue from the same donor. Therefore, in various embodiments, SEQ
ID NO:45 can be used for one or more of the following: i) monitoring treatment of ovarian cancer, ii)
diagnostic assays for ovarian cancer, and iii) developing therapeutics and/or other treatments for
ovarian cancer.

For example, expression of SEQ ID NO:53 was down-regulated in colon tumor tissue versus
normal colon tissue as determined by microarray analysis. Expression of SEQ ID NO:53 was
decreased at least two-fold in matched colon tumor tissue versus normal colon tissue from 5 of 14
donors. Therefore, in various embodiments, SEQ ID NO:53 can be used for one or more of the
following: i) monitoring treatment of colon cancer, ii) diagnostic assays for colon cancer, and iii)
developing therapeutics and/or other treatments for colon cancer.

In another example, expression of SEQ ID NO:53 was up-regulated in lung tumor tissue
versus normal lung tissue as determined by microarray analysis. Expression of SEQ ID NO:53 was
increased at least two-fold in matched lung tumor tissue versus normal lung tissue from 1 of 4
donors. Therefore, in various embodiments, SEQ ID NO:53 can be used for one or more of the
following: i) monitoring treatment of lung cancer, ii) diagnostic assays for lung cancer, and iii)
developing therapeutics and/or other treatments for lung cancer.

In another example, expression of SEQ ID NO:56 was down-regulated in breast carcinoma
cell lines versus cells derived from non-malignant fibrocystic breast epithelial cells as determined by
microarray analysis. Gene expression profiles of nonmalignant mammary epithelial cells were
compared to gene expression profiles of various breast carcinoma lines at different stages of tumor
progression. The cells were grown in defined serum-free H14 medium to 70-80% confluence prior
to RNA harvest. Cell lines compared included: a) MCF-10A, a breast mammary gland (luminal
ductal characteristics) cell line isolated from a 36-year-old woman with fibrocystic breast disease, b)
MCF7, a nonmalignant breast adenocarcinoma cell line isolated from the pleural effusion of a 69-
year-old female, c) T-47D, a breast carcinoma cell line isolated from a pleural effusion obtained
from a 54-year-old female with an infiltrating ductal carcinoma of the breast, d) Sk-BR-3, a breast
adenocarcinoma cell line isolated from a malignant pleural effusion of a 43-year-old female, e) BT-
20, a breast carcinoma cell line derived in vitro from the cells emigrating out of thin slices of the
tumor mass isolated from a 74-year-old female, and f) MDA-mb-231, a breast tumor cell line
isolated from the pleural effusion of a 51-year-old female. Expression of SEQ ID NO:56 was
decreased at least two-fold in 3 breast carcinoma cell lines (T-47D, BT-20, and MCF7) versus the
non-malignant cells (MCF-10A). Therefore, in various embodiments, SEQ ID NO:56 can be used
for one or more of the following: i) monitoring treatment of breast cancer, ii) diagnostic assays for
breast cancer, and iii) developing therapeutics and/or other treatments for breast cancer.
In another example, expression of SEQ ID NO:56 was down-regulated in breast tumor tissue
versus normal breast tissue as determined by microarray analysis. Expression of SEQ ID NO:56 was
decreased at least two-fold in matched breast lobular carcinoma tumor tissue versus normal breast
tissue from one donor. Therefore, in various embodiments, SEQ ID NO:56 can be used for one or
more of the following: i) monitoring treatment of breast cancer, ii) diagnostic assays for breast
cancer, and iii) developing therapeutics and/or other treatments for breast cancer.

In another example, expression of SEQ ID NO:56 was down-regulated in ovarian tumor
tissue versus normal ovarian tissue as determined by microarray analysis. Expression of SEQ ID
NO:56 was decreased at least two-fold in matched ovarian tumor tissue versus normal ovarian tissue
from one donor. Therefore, in various embodiments, SEQ ID NO:56 can be used for one or more of
the following: i) monitoring treatment of ovarian cancer, ii) diagnostic assays for ovarian cancer, and
iii) developing therapeutics and/or other treatments for ovarian cancer.

For example, expression of SEQ ID NO:56 was up-regulated in a prostate cancer cell line
versus cells derived from normal prostate epithelium as determined by microarray analysis. Primary
prostate epithelial cells were compared with prostate carcinomas representative of the different
stages of tumor progression. Cell lines compared included: a) PrEC, a primary prostate epithelial
cell line isolated from a normal donor, b) DU 145, a prostate carcinoma cell line isolated from a
metastatic site in the brain of a 69-year-old male with widespread metastatic prostate carcinoma, c)
LNCaP, a prostate carcinoma cell line isolated from a lymph node biopsy of a 50-year-old male with
metastatic prostate carcinoma, and d) PC-3, a prostate adenocarcinoma cell line isolated from a
metastatic site in the bone of a 62-year-old male with grade IV prostate adenocarcinoma. Cell lines
were grown in basal media in the absence of growth factors and hormones and were compared to
normal PrECs grown under the same conditions. Expression of SEQ ID NO:56 was increased at
least two-fold in one prostate cancer cell line (DU 145) versus PrECs. Therefore, in various
embodiments, SEQ ID NO:56 can be used for one or more of the following: i) monitoring treatment
of prostate cancer, ii) diagnostic assays for prostate cancer, and iii) developing therapeutics and/or
other treatments for prostate cancer.

In two other examples, SEQ ID NO:53 and SEQ ID NO:58 showed tissue-specific
expression as determined by microarray analysis. RNA samples isolated from a variety of normal
human tissues were compared to a common reference sample. Tissues contributing to the reference
sample were selected for their ability to provide a complete distribution of RNA in the human body
and include brain (4%), heart (7%), kidney (3%), lung (8%), placenta (46%), small intestine (9%),
spleen (3%), stomach (6%), testis (9%), and uterus (5%). The normal tissues assayed were obtained
from at least three different donors. RNA from each donor was separately isolated and individually
hybridized to the microarray. Since these hybridization experiments were conducted using a
common reference sample, differential expression values are directly comparable from one tissue to
another. In one example, the expression of SEQ ID NO:53 was increased by at least two-fold in
salivary gland as compared to the reference sample. Therefore, SEQ ID NO:53 can be used as a
tissue marker for salivary gland. In a second example, SEQ ID NO:58 was increased by at least two-
fold in spleen and tonsils as compared to the reference sample. Therefore, SEQ ID NO:58 can be
used as a tissue marker for spleen and tonsils.

For example, expression of SEQ ID NO:63 showed differential expression in breast
carcinoma cell lines compared with a primary culture of epithelial cells derived from normal breast
tissue as determined by microarray analysis. The gene expression profile of a nonmalignant
mammary epithelial cell line was compared to the gene expression profiles of breast carcinoma lines
at different stages of tumor progression. Cell lines compared included: a) BT-20, a breast carcinoma
cell line derived in vitro from the cells emigrating out of thin slices of tumor mass isolated from a
74-year-old female, b) BT-474, a breast ductal carcinoma cell line that was isolated from a solid,
invasive ductal carcinoma of the breast obtained from a 60-year-old woman, c) BT-483, a breast
ductal carcinoma cell line that was isolated from a papillary invasive ductal tumor obtained from a
23-year-old normal, menstruating, parous female with a family history of breast cancer, d) Hs 578T,
a breast ductal carcinoma cell line isolated from a 74-year-old female with breast carcinoma, e)
MCF7, a nonmalignant breast adenocarcinoma cell line isolated from the pleural effusion of a 69-
year-old female, f) MCF-10A, a breast mammary gland (luminal ductal characteristics) cell line
isolated from a 36-year-old woman with fibrocystic breast disease, g) MDA-MB-468, a breast
adenocarcinoma cell line isolated from the pleural effusion of a 51-year-old female with metastatic
adenocarcinoma of the breast, and h) HMEC, a primary breast epithelial cell line isolated from a
normal donor. Expression of SEQ ID NO:63 was increased at least two-fold in one breast cancer
cell line (Hs 578T) and decreased at least two-fold in one breast cancer cell line (BT-474). Although
expression of SEQ ID NO:63 was not affected in the same manner among all breast carcinoma cell
lines, the data suggest that in some populations or stages of breast cancer SEQ ID NO:63 is
differentially expressed. Therefore, in various embodiments, SEQ ID NO:63 can be used for one or
more of the following: i) monitoring treatment of breast cancer, ii) diagnostic assays for breast
cancer, and iii) developing therapeutics and/or other treatments for breast cancer.

For example, expression of SEQ ID NO:64 was down-regulated in preadipocytes taken from
an obese donor versus preadipocytes taken from a non-obese donor as determined by microarray
analysis. Primary subcutaneous preadipocytes were isolated from the adipose tissue of a non-obese
donor, a 28-year-old healthy female with body mass index (BMI) of 23.59, and an obese donor, a 40-
year-old healthy female with a body mass index (BMI) of 32.47. The preadipocytes from each donor
were cultured and induced to differentiate into adipocytes by growing them in differentiation
medium containing PPAR-γ agonist and human insulin (Z[illegible]). Some thiazolidinediones or
PPAR-γ agonists, which bind and activate an orphan nuclear receptor, PPAR-γ, have been shown to
induce human adipocyte differentiation. The preadipocytes were treated with human insulin and
PPAR-γ agonist for 3 days and subsequently were switched to medium containing insulin for a range
of time periods ranging from one to 20 days before the cells were collected for analysis.
Differentiated adipocytes from each donor were compared to untreated preadipocytes, maintained in
culture in the absence of differentiation-inducing agents, from the same donor. Between 80% and
90% of the preadipocytes finally differentiated to adipocytes as observed under phase contrast
microscopy. Expression of SEQ ID NO:64 was decreased at least two-fold in differentiated
preadipocytes from an obese donor versus non-differentiated preadipocytes from the same donor. In
contrast, no differential expression was seen in differentiated preadipocytes from a non-obese donor
versus non-differentiated preadipocytes from the same donor. These data suggest that SEQ ID
NO:64 is differentially expressed in adipocytes from obese subjects but not in adipocytes from non-
obese subjects. Thus, SEQ ID NO:64 is useful for the diagnosis, prognosis, or treatment of diabetes
mellitus, obesity, hypertension, and atherosclerosis. Therefore, in various embodiments, SEQ ID
NO:64 can be used for one or more of the following: i) monitoring treatment of diabetes mellitus,
obesity, hypertension, and atherosclerosis, ii) diagnostic assays for diabetes mellitus, obesity,
hypertension, and atherosclerosis, and iii) developing therapeutics and/or other treatments for
diabetes mellitus, obesity, hypertension, and atherosclerosis.

In another example, SEQ ID NO:61 showed tissue-specific expression as determined by
microarray analysis. RNA samples isolated from a variety of normal human tissues were compared
to a common reference sample. Tissues contributing to the reference sample were selected for their
ability to provide a complete distribution of RNA in the human body and include brain (4%), heart
(7%), kidney (3%), lung (8%), placenta (46%), small intestine (9%), spleen (3%), stomach (6%),
testis (9%), and uterus (5%). The normal tissues assayed were obtained from at least three different
donors. RNA from each donor was separately isolated and individually hybridized to the microarray.
Since these hybridization experiments were conducted using a common reference sample,
differential expression values are directly comparable from one tissue to another. The expression of
SEQ ID NO:61 was increased by at least two-fold in thymus gland as compared to the reference
sample. Therefore, SEQ ID NO:61 can be used as a tissue marker for thymus gland.

XII. Complementary Polynucleotides

Sequences complementary to IRAP-encoding sequences, or any parts thereof, are used to
detect, decrease, or inhibit expression of naturally occurring IRAP. Although use of
oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same
procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are
designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of IRAP. To
inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence
and used to prevent promoter binding to the coding sequence. To inhibit translation, a
complementary oligonucleotide is designed to prevent ribosomal binding to the IRAP-encoding
transcript.

XIII. Expression of IRAP

Expression and purification of IRAP is achieved using bacterial or virus-based expression
systems. For expression of IRAP in bacteria, cDNA is subcloned into an appropriate vector
containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA
transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid
promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory
element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3).
Antibiotic resistant bacteria express IRAP upon induction with isopropyl beta-D-
thiogalactopyranoside (IPTG). Expression of IRAP in eukaryotic cells is achieved by infecting
insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis
virus (AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of
baculovirus is replaced with cDNA encoding IRAP by either homologous recombination or
bacterial-mediated transposition involving transfer plasmid intermediates. Viral infectivity is
maintained and the strong polyhedrin promoter drives high levels of cDNA transcription.
Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect cells in most cases, or
human hepatocytes, in some cases. Infection of the latter requires additional genetic modifications to
baculovirus (Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et
al. (1996) Hum. Gene Ther. 7:1937-1945).

In most expression systems, IRAP is synthesized as a fusion protein with, e.g., glutathione S-
transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step,
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-
kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on
immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham
Biosciences). Following purification, the GST moiety can be proteolytically cleaved from IRAP at
specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak).
6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins
(QIAGEN). Methods for protein expression and purification are discussed in Ausubel et al. (supra,
ch. 10 and 16). Purified IRAP obtained by these methods can be used directly in the assays shown in
Examples XVII and XVIII, where applicable.
XIV. Functional Assays

IRAP function is assessed by expressing the sequences encoding IRAP at physiologically
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice
include PCMV SPORT plasmid (Invitrogen, Carlsbad CA) and PCR3.1 plasmid (Invitrogen), both
of which contain the cytomegalovirus promoter. 5-10 µg of recombinant vector are transiently
transfected into a human cell line, for example, an endothelial or hematopoietic cell line, using either
liposome formulations or electroporation. 1-2 µg of an additional plasmid containing sequences
encoding a marker protein are co-transfected. Expression of a marker protein provides a means to
distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression
from the recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein
(GFP; BD Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated,
laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and
to evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies
the uptake of fluorescent molecules that diagnose events preceding or coincident with cell death.
These events include changes in nuclear DNA content as measured by staining of DNA with
propidium iodide; changes in cell size and granularity as measured by forward light scatter and 90
degree side light scatter; down-regulation of DNA synthesis as measured by decrease in
bromodeoxyuridine uptake; alterations in expression of cell surface and intracellular proteins as
measured by reactivity with specific antibodies; and alterations in plasma membrane composition as
measured by the binding of fluorescein-conjugated Annexin V protein to the cell surface. Methods
in flow cytometry are discussed in Ormerod, M.G. (1994; Flow Cytometry, Oxford, New York NY).

The influence of IRAP on gene expression can be assessed using highly purified populations
of cells transfected with sequences encoding IRAP and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success
NY). mRNA can be purified from the cells using methods well known by those of skill in the art.
Expression of mRNA encoding IRAP and other genes of interest can be analyzed by northern
analysis or microarray techniques.




XV. Production of IRAP Specific Antibodies

IRAP substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g.,
Harrington, (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used
to immunize animals (e.g., rabbits, mice, etc.) and to produce antibodies using standard protocols.

Alternatively, the IRAP amino acid sequence is analyzed using LASERGENE software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are
well described in the art (Ausubel et al., supra, ch. 11).

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH (Sigma-
Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to
increase immunogenicity (Ausubel et al., supra). Rabbits are immunized with the oligopeptide-KLH
complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-IRAP
activity by, for example, binding the peptide or IRAP to a substrate, blocking with 1% BSA, reacting
with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG.

XVI. Purification of Naturally Occurring IRAP Using Specific Antibodies

Naturally occurring or recombinant IRAP is substantially purified by immunoaffinity
chromatography using antibodies specific for IRAP. An immunoaffinity column is constructed by
covalently coupling anti-IRAP antibody to an activated chromatographic resin, such as
CNBr-activated SEPHAROSE (Amersham Biosciences). After the coupling, the resin is blocked
and washed according to the manufacturer's instructions.

Media containing IRAP are passed over the immunoaffinity column, and the column is
washed under conditions that allow the preferential absorbance of IRAP (e.g., high ionic strength
buffers in the presence of detergent). The column is eluted under conditions that disrupt
antibody/IRAP binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such
as urea or thiocyanate ion), and IRAP is collected.

XVII. Identification of Molecules Which Interact with IRAP

IRAP, or biologically active fragments thereof, are labeled with ¹²⁵I Bolton-Hunter reagent
(Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539). Candidate molecules previously
arrayed in the wells of a multi-well plate are incubated with the labeled IRAP, washed, and any wells
with labeled IRAP complex are assayed. Data obtained using different concentrations of IRAP are
used to calculate values for the number, affinity, and association of IRAP with the candidate
molecules.
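
The text does not say how the number and affinity values are calculated from the binding data. One conventional approach, offered here only as a minimal sketch, is a Scatchard analysis. It assumes a single class of independent binding sites and that both free and bound labeled IRAP have been measured at each concentration. The function name `scatchard_fit` and the sample values are hypothetical.

```python
from typing import Sequence, Tuple


def scatchard_fit(free: Sequence[float], bound: Sequence[float]) -> Tuple[float, float]:
    """Estimate (Kd, Bmax) from paired free/bound ligand measurements.

    Assumes one class of independent sites, so that
        bound / free = Bmax / Kd - bound / Kd
    and a straight line through (bound, bound/free) has slope -1/Kd.
    """
    if len(free) != len(bound) or len(free) < 2:
        raise ValueError("need at least two paired (free, bound) measurements")

    x = list(bound)                                # Scatchard x-axis: bound ligand
    y = [b / f for b, f in zip(bound, free)]       # Scatchard y-axis: bound / free

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx                              # ordinary least-squares slope
    intercept = mean_y - slope * mean_x

    kd = -1.0 / slope                              # dissociation constant (affinity)
    bmax = intercept * kd                          # number of binding sites
    return kd, bmax


# Illustrative, made-up data generated from Kd = 2.0 and Bmax = 10.0.
example_free = [0.5, 1.0, 2.0, 4.0, 8.0]
example_bound = [10.0 * f / (2.0 + f) for f in example_free]
example_kd, example_bmax = scatchard_fit(example_free, example_bound)
```

A curved Scatchard plot would indicate that the single-site assumption does not hold, in which case a nonlinear fit of the binding isotherm would be the more appropriate tool.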

Alternatively, molecules interacting with IRAP are analyzed using the yeast two-hybrid
system as described in Fields, S. and O. Song (1989; Nature 340:245-246), or using commercially
available kits based on the two-hybrid system, such as the MATCHMAKER system (BD Clontech).

IRAP may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT)
which employs the yeast two-hybrid system in a high-throughput manner to determine all
interactions between the proteins encoded by two large libraries of genes (Nandabalan, K. et al.
(2000) U.S. Patent No. 6,057,101).

XVIII. Demonstration of IRAP Activity

An assay for IRAP activity measures the ability of IRAP to recognize and precipitate
antigens from serum. This activity can be measured by the quantitative precipitin reaction (Golub,
E.S. et al. (1987) Immunology: A Synthesis, Sinauer Associates, Sunderland, MA, pages 113-115).
IRAP is isotopically labeled using methods known in the art. Various serum concentrations are
added to constant amounts of labeled IRAP. IRAP-antigen complexes precipitate out of solution and
are collected by centrifugation. The amount of precipitable IRAP-antigen complex is proportional to
the amount of radioisotope detected in the precipitate. The amount of precipitable IRAP-antigen
complex is plotted against the serum concentration. For various serum concentrations, a
characteristic precipitin curve is obtained, in which the amount of precipitable IRAP-antigen
complex initially increases proportionately with increasing serum concentration, peaks at the
equivalence point, and then decreases proportionately with further increases in serum concentration.
Thus, the amount of precipitable IRAP-antigen complex is a measure of IRAP activity which is
characterized by sensitivity to both limiting and excess quantities of antigen.
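
As a minimal sketch of how the precipitin curve described above might be read, the following locates the equivalence point as the serum concentration giving the largest precipitated signal and labels each measurement as lying on the rising or falling limb. The function names and the sample counts are hypothetical and are not taken from the cited reference.

```python
from typing import List, Sequence, Tuple


def equivalence_point(serum_conc: Sequence[float],
                      precipitate: Sequence[float]) -> Tuple[float, float]:
    """Return (serum concentration, signal) at the peak of the precipitin curve."""
    if len(serum_conc) != len(precipitate) or len(serum_conc) == 0:
        raise ValueError("need paired, non-empty measurements")
    # The first maximum wins if several points share the peak value.
    peak_index = max(range(len(precipitate)), key=lambda i: precipitate[i])
    return serum_conc[peak_index], precipitate[peak_index]


def classify_zones(serum_conc: Sequence[float],
                   precipitate: Sequence[float]) -> List[str]:
    """Label each point as 'rising', 'equivalence', or 'falling' relative to the peak."""
    peak_conc, _ = equivalence_point(serum_conc, precipitate)
    zones = []
    for conc in serum_conc:
        if conc < peak_conc:
            zones.append("rising")
        elif conc == peak_conc:
            zones.append("equivalence")
        else:
            zones.append("falling")
    return zones


# Illustrative, made-up radioactivity counts for a serum dilution series.
example_conc = [1.0, 2.0, 4.0, 8.0, 16.0]
example_cpm = [100.0, 220.0, 400.0, 310.0, 150.0]
example_peak = equivalence_point(example_conc, example_cpm)
example_zones = classify_zones(example_conc, example_cpm)
```

With coarse dilution steps the true equivalence point lies somewhere between the sampled concentrations, so a finer series around the observed peak would sharpen the estimate.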

Alternatively, an assay for IRAP activity measures the expression of IRAP on the cell
surface. cDNA encoding IRAP is transfected into a non-leukocytic cell line. Cell surface proteins
are labeled with biotin (de la Fuente, M.A. et al. (1997) Blood 90:2398-2405).
Immunoprecipitations are performed using IRAP-specific antibodies, and immunoprecipitated
samples are analyzed using SDS-PAGE and immunoblotting techniques. The ratio of labeled
immunoprecipitant to unlabeled immunoprecipitant is proportional to the amount of IRAP expressed
on the cell surface.

Alternatively, an assay for IRAP activity measures the amount of cell aggregation induced
by overexpression of IRAP. In this assay, cultured cells such as NIH3T3 are transfected with cDNA
encoding IRAP contained within a suitable mammalian expression vector under control of a strong
promoter. Cotransfection with cDNA encoding a fluorescent marker protein, such as Green
Fluorescent Protein (CLONTECH), is useful for identifying stable transfectants. The amount of cell
agglutination, or clumping, associated with transfected cells is compared with that associated with
untransfected cells. The amount of cell agglutination is a direct measure of IRAP activity.

IRAP protease activity is measured by the hydrolysis of appropriate synthetic peptide
substrates conjugated with various chromogenic molecules in which the degree of hydrolysis is
quantified by spectrophotometric (or fluorometric) absorption of the released chromophore (Beynon,
R.J. and J.S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford University Press,
New York, NY, pp. 25-55). Peptide substrates are designed according to the category of protease
activity as endopeptidase (serine, cysteine, aspartic proteases, or metalloproteases), aminopeptidase
(leucine aminopeptidase), or carboxypeptidase (carboxypeptidases A and B, procollagen C-
proteinase). Commonly used chromogens are 2-naphthylamine, 4-nitroaniline, and furylacrylic acid.
Assays are performed at ambient temperature and contain an aliquot of the enzyme and the
appropriate substrate in a suitable buffer. Reactions are carried out in an optical cuvette, and the
increase/decrease in absorbance of the chromogen released during hydrolysis of the peptide substrate
is measured. The change in absorbance is proportional to the enzyme activity in the assay.
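
The proportionality between absorbance change and enzyme activity can be made concrete with the Beer-Lambert law. The sketch below fits a straight line to absorbance readings over time and divides the slope by the product of the extinction coefficient and the path length. It is only an illustration under stated assumptions: the readings are taken in the linear (initial rate) phase, and the extinction coefficient is a placeholder that would have to come from the chromogen's own documentation. The function names are hypothetical.

```python
from typing import Sequence


def absorbance_slope(times_min: Sequence[float], absorbance: Sequence[float]) -> float:
    """Least-squares slope of absorbance versus time, in absorbance units per minute."""
    n = len(times_min)
    if n != len(absorbance) or n < 2:
        raise ValueError("need at least two paired (time, absorbance) readings")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbance) / n
    stt = sum((t - mean_t) ** 2 for t in times_min)
    sta = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbance))
    return sta / stt


def hydrolysis_rate(times_min: Sequence[float],
                    absorbance: Sequence[float],
                    epsilon_per_M_cm: float,
                    path_cm: float = 1.0) -> float:
    """Rate of chromophore release in mol/L per minute (Beer-Lambert: A = epsilon * l * c).

    A negative result corresponds to an assay read as a decrease in absorbance.
    """
    return absorbance_slope(times_min, absorbance) / (epsilon_per_M_cm * path_cm)


# Illustrative, made-up readings; epsilon = 10000 is a placeholder, not a measured value.
example_times = [0.0, 1.0, 2.0, 3.0]
example_abs = [0.10, 0.15, 0.20, 0.25]
example_rate = hydrolysis_rate(example_times, example_abs, epsilon_per_M_cm=10000.0)
```

Dividing the resulting rate by the amount of enzyme in the aliquot would give a specific activity, which is the form in which different preparations are usually compared.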

In the alternative, an assay for IRAP protease activity takes advantage of fluorescence
resonance energy transfer (FRET) that occurs when one donor and one acceptor fluorophore with an
appropriate spectral overlap are in close proximity. A flexible peptide linker containing a cleavage
site specific for IRAP is fused between a red-shifted variant (RSGFP4) and a blue variant (BFP5) of
Green Fluorescent Protein. This fusion protein has spectral properties that suggest energy transfer is
occurring from BFP5 to RSGFP4. When the fusion protein is incubated with IRAP, the substrate is
cleaved, and the two fluorescent proteins dissociate. This is accompanied by a marked decrease in
energy transfer which is quantified by comparing the emission spectra before and after the addition
of IRAP (Mitra, R.D. et al. (1996) Gene 173:13-17). This assay can also be performed in living cells.
In this case the fluorescent substrate protein is expressed constitutively in cells and IRAP is
introduced on an inducible vector so that FRET can be monitored in the presence and absence of
IRAP (Sagot, I. et al. (1999) FEBS Letters 447:53-57).
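
One simple way to express the comparison of emission spectra before and after the addition of IRAP is an acceptor-to-donor emission ratio. The sketch below is an assumption for illustration, not the method of the cited references: it takes one emission intensity for the donor (BFP5) and one for the acceptor (RSGFP4) at their respective emission peaks, measured under donor excitation, and reports the fractional loss of the ratio after cleavage. The function names and intensities are hypothetical.

```python
def fret_ratio(donor_emission: float, acceptor_emission: float) -> float:
    """Acceptor/donor emission ratio under donor excitation; higher means more transfer."""
    if donor_emission <= 0:
        raise ValueError("donor emission must be positive")
    return acceptor_emission / donor_emission


def fractional_fret_loss(ratio_before: float, ratio_after: float) -> float:
    """Fraction of the initial emission ratio lost after the protease is added."""
    if ratio_before <= 0:
        raise ValueError("initial ratio must be positive")
    return 1.0 - ratio_after / ratio_before


# Illustrative, made-up intensities (arbitrary units) before and after adding protease.
example_before = fret_ratio(donor_emission=100.0, acceptor_emission=200.0)
example_after = fret_ratio(donor_emission=180.0, acceptor_emission=90.0)
example_loss = fractional_fret_loss(example_before, example_after)
```

A ratio is used rather than a single intensity because it is less sensitive to differences in substrate concentration between samples; correcting for direct acceptor excitation and donor bleed-through is outside the scope of this sketch.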

Various modifications and variations of the described compositions, methods, and systems
of the invention will be apparent to those skilled in the art without departing from the scope and
spirit of the invention. It will be appreciated that the invention provides novel and useful proteins,
and their encoding polynucleotides, which can be used in the drug discovery process, as well as
methods for using these compositions for the detection, diagnosis, and treatment of diseases and
conditions. Although the invention has been described in connection with certain embodiments, it
should be understood that the invention as claimed should not be unduly limited to such specific
embodiments. Nor should the description of such embodiments be considered exhaustive or limit the
invention to the precise forms disclosed. Furthermore, elements from one embodiment can be
readily recombined with elements from one or more other embodiments. Such combinations can
form a number of embodiments within the scope of the invention. It is intended that the scope of the
invention be defined by the following claims and their equivalents.






Table 1

Incyte        Polypeptide    Incyte            Polynucleotide    Incyte
Project ID    SEQ ID NO:     Polypeptide ID    SEQ ID NO:        Polynucleotide ID

7522043       1              7522043CD1        33                7522043CB1
7523539       2              7523539CD1        34                7523539CB1
7523587       3              7523587CD1        35                7523587CB1
7523622       4              7523622CD1        36                7523622CB1
7523711       5              7523711CD1        37                7523711CB1
7523729       6              7523729CD1        38                7523729CB1
7523763       7              7523763CD1        39                7523763CB1
7523006       8              7523006CD1        40                7523006CB1
7523261       9              7523261CD1        41                7523261CB1
7523277       10             7523277CD1        42                7523277CB1
7523279       11             7523279CD1        43                7523279CB1
7523296       12             7523296CD1        44                7523296CB1
7521779       13             7521779CD1        45                7521779CB1
7521826       14             7521826CD1        46                7521826CB1
7521901       15             7521901CD1        47                7521901CB1
7522003       16             7522003CD1        48                7522003CB1
7522014       17             7522014CD1        49                7522014CB1
7522038       18             7522038CD1        50                7522038CB1
7523429       19             7523429CD1        51                7523429CB1
7523941       20             7523941CD1        52                7523941CB1
7524607       21             7524607CD1        53                7524607CB1
7524690       22             7524690CD1        54                7524690CB1
7524733       23             7524733CD1        55                7524733CB1
7522128       24             7522128CD1        56                7522128CB1
7522158       25             7522158CD1        57                7522158CB1
7524191       26             7524191CD1        58                7524191CB1
7525225       27             7525225CD1        59                7525225CB1
7513053       28             7513053CD1        60                7513053CB1
7513086       29             7513086CD1        61                7513086CB1
7513557       30             7513557CD1        62                7513557CB1
7513718       31             7513718CD1        63                7513718CB1
7514003       32             7514003CD1        64                7514003CB1






Incyte Polypeptide ID: 7523296CD1, 7521779CD1

Analytical Methods and Databases:
BLAST_PRODOM, BLAST_DOMO, BLAST_DOMO, MOTIFS, MOTIFS, SPSCAN, HMMER, PROFILESCAN,
BLIMPS_PRINTS, BLAST_DOMO, BLAST_DOMO, BLAST_DOMO, MOTIFS, MOTIFS, MOTIFS, MOTIFS, SPSCAN,
HMMER, HMMER, HMMER

Signature Sequences, Domains and Motifs:
PRECURSOR SIGNAL CYTOTOXIC T-LYMPHOCYTE PROTEIN CTLA4 IMMUNOGLOBULIN FOLD T-CELL
TRANSMEMBRANE PD012955: M1-F56
T-CELL SURFACE GLYCOPROTEIN CD28 DM03346|P16410|1-222: M1-P158
T-CELL SURFACE GLYCOPROTEIN CD28 DM03346|P31043|1-220: V40-V151
Potential Phosphorylation Sites: S49 S159 T96 Y127
Potential Glycosylation Sites: N113 N145
Signal Peptide: H26-A46
Cytosolic domain: M1-P31
Transmembrane domain: L32-S54
Non-cytosolic domain: K55-C254
C-type lectin domain signature and profile: S205-R251
Type II antifreeze protein signature PR00356: P125-C137, C137-C154, A155-F172, F208-A219,
W234-I247
C-TYPE LECTIN DM00035|P20693|179-304: 01 19-C248
C-TYPE LECTIN DM00035|P06734|156-281: C123-C248
C-TYPE LECTIN DM00035|A46274|248-377: C123-E249
C-TYPE LECTIN DM00035|P02707|74-202: P125-K250
Potential Phosphorylation Sites: S12 S28 S241 T3 T65 T66
Potential Glycosylation Sites: N120
Signal Peptide: M1-G21
Signal Peptide: M1-S25
Signal Peptide: M1-S24
RECEPTOR CHAIN PRECURSOR TRANSMEMBRANE PD02382: C155-N170



Incyte Polypeptide ID: 7522014CD1, 7522038CD1, 7523429CD1

Analytical Methods and Databases:
MOTIFS, MOTIFS, SPSCAN, HMMER, TMHMMER, BLAST_PRODOM, MOTIFS, SPSCAN, HMMER, HMMER, HMMER,
HMMER, MOTIFS, MOTIFS, SPSCAN, HMMER, HMMER, HMMER, HMMER, HMMER, MOTIFS

Signature Sequences, Domains and Motifs:
Potential Phosphorylation Sites: S122 T74 T116 T212 T213 T219 T220 Y174
Potential Glycosylation Sites: N37 N64 N157 N163 N189 N211
signal_cleavage: M1-A44
Signal Peptide: Y15-A44
Cytosolic domain: I38-R98
Transmembrane domain: Y15-C37
Non-cytosolic domain: M1-P14
CD27 LIGAND CD27L CD70 ANTIGEN CYTOKINE TRANSMEMBRANE GLYCOPROTEIN
SIGNAL ANCHOR PD169505: M1-G64
Potential Phosphorylation Sites: S9 S52 S74 T63 T96
Signal Peptide: M1-A16
Signal Peptide: M1-T18
Signal Peptide: M1-T21
Signal Peptide: M1-E23
Potential Phosphorylation Sites: S78 S80 T21 T57 T67 T97
Potential Glycosylation Sites: N69
Signal Peptide: M17-A32
Signal Peptide: M17-G34
Signal Peptide: M17-S36
Signal Peptide: M1-G34
Signal Peptide: R11-G34
Cytosolic domain: R33-L60
Transmembrane domain: Q15-A32
Non-cytosolic domain: M1-R14
PLATELET FACTOR 4 VARIANT PRECURSOR PF4VAR1 PROTEOGLYCAN HEPARIN-
BINDING SIGNAL PD055150: M1-S36
Potential Phosphorylation Sites: S44 S50 T13



Incyte Polypeptide ID: 7523941CD1

Analytical Methods and Databases:
HMMER, SPSCAN, BLAST_PRODOM, BLAST_DOMO, MOTIFS, MOTIFS, SPSCAN, MOTIFS, SPSCAN, HMMER_PFAM,
HMMER_PFAM

Signature Sequences, Domains and Motifs:
Signal Peptide: M1-S20, M1-A18, M1-P22
signal_cleavage: M1-S20
Reprolysin family propeptide: H71-G139
DOMAIN MDC TRANSMEMBRANE METALLOPROTEINASE EGF-LIKE DISINTEGRIN-
LIKE PRECURSOR CYSTEINE-RICH PD000935: Y29-L127
ZINC; NEUTRAL; METALLOPEPTIDASE; HEMORRHAGIC;
DM00533|Q05910|1-187: L4-R128
DM00533|P15167|1-188: M26-R128
DM00533|P34182|1-185: M26-R128
DM00533|JC4342|1-188: M26-R128
Potential Phosphorylation Sites: T133, Y107
Potential Glycosylation Sites: N67, N91
PAROTID PRECURSOR SECRETORY SIGNAL GLAND SALIVARY EBNER VON
SUBMANDIBULAR MINOR PD011295: E19-I188
PAROTID SECRETORY PROTEIN
DM04779|P07743|12-234: V13-N202
DM04779|B42337|12-235: V13-N202
DM04779|A42337|12-206: G12-Q204
Thrombospondin type 1 domain: S46-C94
Low-density lipoprotein receptor domain class A: D100-S136



Incyte Polypeptide ID: 7513053CD1, 7513086CD1

Analytical Methods and Databases:
MOTIFS, MOTIFS, SPSCAN, HMMER, HMMER, HMMER, HMMER, HMMER, HMMER, HMMER, HMMER_PFAM,
BLAST_PRODOM, BLAST_DOMO, BLAST_DOMO, BLAST_DOMO, BLAST_DOMO, MOTIFS, MOTIFS, MOTIFS, SPSCAN,
HMMER, HMMER

Signature Sequences, Domains and Motifs:
CLASS I HISTOCOMPATIBILITY ANTIGEN
DM00083|P06126|2-195: L2-A196
DM00083|P29017|2-197: F3-K195
DM00083|S472*6|2-196: L4-A196
Potential Phosphorylation Sites: S39, S76, S123, S277, S283, T91, T247
Potential Glycosylation Sites: N37, N60, N74, NWS
Signal Peptide: P8-G23
Signal Peptide: P6-G23
Signal Peptide: P4-G23
Signal Peptide: M1-G23
Signal Peptide: M1-A24
Signal Peptide: M1-L28
Signal Peptide: M1-S26
Immunoglobulins and major histocompatibility complex IPB000495: V218-L240, Y273-L290
MHC CLASS I ANTIGEN PRECURSOR SIGNAL CHAIN HISTOCOMPATIBILITY ALPHA
GLYCOPROTEIN PD000050: S26-L197
CLASS I HISTOCOMPATIBILITY ANTIGEN DM00083|P13599|6-195: L11-R194
CLASS I HISTOCOMPATIBILITY ANTIGEN DM00083|P15978|6-207: P6-P202
CLASS I HISTOCOMPATIBILITY ANTIGEN DM00083|S33355|2-203: L15-P202
IMMUNOGLOBULIN DM00001|P13599|204-286: P203-L286
Immunoglobulins and major histocompatibility complex proteins signature: Y273-H279
Potential Phosphorylation Sites: S63 S161 S204 S267 S295 T130 T264 T307 Y273
Potential Glycosylation Sites: N125
signal_cleavage: M1-G17
Signal Peptide: M1-A15
Signal Peptide: M1-A19



